Abstract



A single-letter achievable rate region is proposed for the two-receiver discrete memoryless broadcast 
channel with generalized feedback. The coding strategy involves block-Markov superposition coding using 
Marton's coding scheme for the broadcast channel without feedback as the starting point. If the message
rates in the Marton scheme are too high to be decoded at the end of a block, each receiver is left with 
a list of messages compatible with its output. Resolution information is sent in the following block to 
enable each receiver to resolve its list. The key observation is that the resolution information of the first 
receiver is correlated with that of the second. This correlated information is efficiently transmitted via joint 
source-channel coding, using ideas similar to the Han-Costa coding scheme. Using the result, we obtain an 
achievable rate region for the stochastically degraded AWGN broadcast channel with noisy feedback from 
only one receiver. It is shown that this region is strictly larger than the no-feedback capacity region. 

1 Introduction 

The two-receiver discrete memoryless broadcast channel (BC) is shown in Figure 1(a). The channel has one
transmitter which generates a channel input $X$, and two receivers which receive $Y$ and $Z$, respectively. The
channel is characterized by a conditional law $P_{YZ|X}$. The transmitter wishes to communicate information
simultaneously to the receivers at rates $(R_0, R_1, R_2)$, where $R_0$ is the rate of the common message, and $R_1$, $R_2$
are the rates of the private messages of the two receivers. This channel has been studied extensively. The
largest known set of achievable rates for this channel without feedback is due to Marton [1]. Marton's rate
region is equal to the capacity region in all cases where the capacity region is known. (See for example, for a list of such
channels.)

Figure 1(b) shows a BC with generalized feedback. $S_n$ represents the feedback signal available at the
transmitter at time $n$. This model includes noiseless feedback from both receivers ($S_n = (Y_n, Z_n)$), partial
feedback ($S_n = Y_n$), as well as noisy feedback ($S_n = Y_n + \text{noise}$). El Gamal showed in [3] that feedback
does not enlarge the capacity region of a physically degraded BC. Later, through a simple example, Dueck [4]
demonstrated that feedback can strictly enlarge the capacity region of a general BC. For the stochastically
degraded AWGN broadcast channel with noiseless feedback, an achievable rate region larger than the
no-feedback capacity region was established in [5], and more recently, in [6]. A finite-letter achievable rate region
(in terms of directed information) for the discrete memoryless BC with feedback was obtained by Kramer [7];
using this characterization, it was shown that rates strictly outside the no-feedback capacity region could be
achieved for the binary symmetric BC with noiseless feedback.

Figure 1: The discrete memoryless broadcast channel with (a) no feedback, (b) generalized feedback.

In this paper, we establish a single-letter achievable rate region for the memoryless BC with generalized 
feedback. We use the proposed region to compute achievable rates for the stochastically degraded AWGN BC 
with noisy feedback from one receiver, and show that rates strictly outside the no-feedback capacity region 
can be achieved. 

Before describing our coding strategy, let us revisit the example from [4]. Consider the BC in Figure 2.
The channel input is a binary triple $(X_0, X_1, X_2)$. $X_0$ is transmitted cleanly to both receivers. In addition,
receiver 1 receives $X_1 \oplus N$ and receiver 2 receives $X_2 \oplus N$, where $N$ is an independent binary Bernoulli(1/2)
noise variable. Here, the operation $\oplus$ denotes the modulo-two sum. Without feedback, the maximum sum
rate for this channel is 1 bit/channel use, achieved by using the clean input $X_0$ alone. In other words, no
information can be reliably transmitted through inputs $X_1$ and $X_2$.



Figure 2: The channel input is a binary triple $X = (X_0, X_1, X_2)$. Receiver 1 observes $Y = (X_0, X_1 \oplus N)$ and
receiver 2 observes $Z = (X_0, X_2 \oplus N)$. $N \sim$ Bernoulli(1/2) is an independent noise variable.

Dueck described a simple scheme to achieve a greater sum rate using feedback. In the first channel use,
transmit one bit to each receiver $i$ through $X_i$, $i = 1, 2$. Receivers 1 and 2 then receive $Y = X_1 \oplus N$ and
$Z = X_2 \oplus N$, respectively, and cannot recover $X_i$. The transmitter learns $Y$, $Z$ through feedback and can
compute $N = Y \oplus X_1 = Z \oplus X_2$. For the next channel use, the transmitter sets $X_0 = N$. Since $X_0$ is received
noiselessly by both receivers, receiver 1 can now recover $X_1$ as $Y \oplus N$. Similarly, receiver 2 reconstructs $X_2$ as
$Z \oplus N$. We can repeat this idea over several transmissions: in each channel use, transmit a fresh pair of bits
(through $X_1, X_2$) as well as the noise realization of the previous channel use (through $X_0$). This yields a sum
rate of 2 bits/channel use. This is, in fact, the sum-capacity of the channel since it equals the cut-set bound
$\max_{P_X} I(X; YZ)$.
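The pipelined scheme described above can be checked with a short simulation. The following is a minimal sketch, not part of the original paper: the function name `simulate_dueck`, the use of a dummy value on $X_0$ in the first channel use, and the single extra channel use that flushes the last noise bit are all assumptions made for illustration.

```python
import random


def simulate_dueck(num_pairs, seed=0):
    """Simulate the feedback scheme on the binary channel of Figure 2.

    Each channel use carries a fresh bit pair on (X1, X2) and, on X0,
    the noise bit of the previous use (learned through feedback).
    Returns the sent bits, the decoded bits, and the resulting sum rate.
    """
    rng = random.Random(seed)
    bits1 = [rng.randint(0, 1) for _ in range(num_pairs)]
    bits2 = [rng.randint(0, 1) for _ in range(num_pairs)]
    decoded1, decoded2 = [], []
    prev_noise = None          # noise of the previous use, known via feedback
    prev_y = prev_z = None     # noisy components seen by receivers 1 and 2

    for t in range(num_pairs + 1):
        # X0 is received cleanly; it carries the previous noise realization
        # (a dummy 0 in the very first use, an assumption of this sketch).
        x0 = 0 if prev_noise is None else prev_noise
        if prev_noise is not None:
            decoded1.append(prev_y ^ x0)   # receiver 1: X1 = Y xor N
            decoded2.append(prev_z ^ x0)   # receiver 2: X2 = Z xor N
        if t == num_pairs:
            break                          # last use only flushes the noise
        noise = rng.randint(0, 1)          # N ~ Bernoulli(1/2)
        y = bits1[t] ^ noise
        z = bits2[t] ^ noise
        # Feedback: the transmitter recovers N from either receiver's output.
        prev_noise = y ^ bits1[t]
        assert prev_noise == z ^ bits2[t]
        prev_y, prev_z = y, z

    sum_rate = 2 * num_pairs / (num_pairs + 1)
    return bits1, bits2, decoded1, decoded2, sum_rate
```

With `num_pairs` fresh bit pairs sent over `num_pairs + 1` channel uses, the sum rate in this sketch is `2 * num_pairs / (num_pairs + 1)`, which approaches 2 bits/channel use as the number of uses grows.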

The example suggests a natural way to exploit feedback in a broadcast channel. If we transmit a block 
of information at rates outside the no-feedback capacity region, the receivers cannot uniquely decode their 
messages at the end of the block. Each receiver now has a list of codewords that are jointly typical with its 
channel output. In the next block, we attempt to resolve these lists at the two receivers. The key observation 
is that the resolution information needed by receiver 1 is in general correlated with the resolution information 
needed by receiver 2. The above example is an extreme case of this: the resolution information of the two 
receivers is identical, i.e., the correlation is perfect! 

In general, the resolution information of the two receivers is not perfectly correlated, but can still be transmitted
over the BC more efficiently than independent information. This is analogous to transmitting correlated
sources over a BC using joint source-channel coding [8]-[12]. At the heart of the proposed coding scheme is
a way to represent the resolution information of the two receivers as a pair of correlated sources, which is
then transmitted efficiently in the next block using joint source-channel coding, along the lines of [8]. We
repeat this idea over several blocks of transmission, with each block containing independent fresh information
superimposed over correlated resolution information for the previous block.
The following are the main contributions of this paper: 

• We obtain a single-letter achievable rate region for the discrete memoryless BC with generalized feedback. 
The proposed region contains three extra random variables in addition to those in Marton's rate region. 

• Using a simpler form of the rate region with only one extra random variable, we compute achievable
rates for the AWGN broadcast channel with noisy feedback. It is shown that rates outside the
no-feedback capacity region can be achieved even with noisy feedback from only one receiver. This is the
first characterization of achievable rates for the AWGN BC with noisy feedback at finite SNR, and is in
contrast to the finding in [13] that noisy feedback does not increase the prelog of the sum-capacity as
the SNR grows asymptotically large.

One feature of the proposed region is that it includes the case where there is a common message to be
transmitted to both receivers, in addition to their private messages. The previously known schemes for
the AWGN BC with noiseless feedback [5], [6] assume that there is no common message.



• At the conference where our result was first presented [14], another rate region for the BC with feedback
was proposed independently by Shayevitz and Wigger [15]. Though a direct comparison of the two
regions does not appear feasible, we show that the rates for the examples presented in [15] can also be
obtained using the proposed region.

Notation: We use uppercase letters to denote random variables, lowercase for their realizations, and calligraphic
notation for their alphabets. Boldface notation is used for random vectors. Unless otherwise stated,
all vectors have length $n$. Thus $\mathbf{A} = A^n = (A_1, \ldots, A_n)$ represents a random vector, and $\mathbf{a} = a^n = (a_1, \ldots, a_n)$
a realization. The $\epsilon$-strongly typical set of block-length $n$ of a random variable with distribution $P$ is denoted
$\mathcal{A}_\epsilon^{(n)}(P)$. $\delta(\epsilon)$ is used to denote a generic positive function of $\epsilon$ that goes to zero as $\epsilon \to 0$. Logarithms are
with base 2, and entropy and mutual information are measured in bits. For $\alpha \in (0, 1)$, $\bar{\alpha} = 1 - \alpha$. $\oplus$ denotes
modulo-two addition.

In the following, we give an intuitive description of a two-phase coding scheme for communicating over a
BC with noiseless feedback. We will use the notation $\tilde{\ }$ to indicate the random variables used in the first
phase. Thus $(\tilde{Y}, \tilde{Z})$ denotes the channel output pair for the first phase, and $(Y, Z)$ the output pair for the
second phase. We start with Marton's coding strategy for the discrete memoryless BC without feedback. The
message rates of the two receivers are assumed to be outside Marton's achievable rate region. Let $\tilde{U}$, $\tilde{V}$, and
$\tilde{W}$ denote the auxiliary random variables used to encode the information. $\tilde{W}$ carries the information meant
to be decoded at both receivers. $\tilde{U}$ and $\tilde{V}$ carry the rest of the information meant for receivers 1 and
2, respectively. The $\tilde{U}$- and $\tilde{V}$-codebooks are constructed by randomly sampling the $\tilde{U}$- and $\tilde{V}$-typical sets,
respectively. Let $\tilde{\mathbf{U}}$, $\tilde{\mathbf{V}}$, and $\tilde{\mathbf{W}}$ denote the three random codewords chosen by the transmitter. The channel
input vector $\tilde{\mathbf{X}}$ is obtained by 'fusing' the triple $(\tilde{\mathbf{U}}, \tilde{\mathbf{V}}, \tilde{\mathbf{W}})$.

Since the rates lie outside Marton's region, the receivers are not able to decode the information contained
in $\tilde{\mathbf{U}}$, $\tilde{\mathbf{V}}$, and $\tilde{\mathbf{W}}$. Instead, they can only produce a list of highly likely codewords given their respective channel
output vectors. At the first decoder, this list is formed by collecting all $(\tilde{U}, \tilde{W})$-codeword pairs that are jointly
typical with the channel output. A similar list of $(\tilde{V}, \tilde{W})$-codeword pairs is formed at the second receiver. Note
that even with feedback, the total transmission rate of the BC cannot exceed the capacity of the point-to-point
channel with input $X$ and outputs $(Y, Z)$ (since the channel is memoryless). Hence, given both channel output
vectors $(\tilde{\mathbf{Y}}, \tilde{\mathbf{Z}})$, the posterior probability of the codewords will be concentrated on the transmitted codeword
triple.

At the end of the first phase, the feedback vector $\tilde{\mathbf{S}}$ is available at the encoder. In the second phase, we
treat $(\tilde{\mathbf{U}}, \tilde{\mathbf{W}})$ as the source of information to be transmitted to the first decoder, and $(\tilde{\mathbf{V}}, \tilde{\mathbf{W}})$ as the source of
information to be transmitted to the second decoder. The objective in the second phase is to communicate
these two correlated pairs over the BC, while treating $\tilde{\mathbf{S}}$ as source state information and $\tilde{\mathbf{Y}}$ and $\tilde{\mathbf{Z}}$ as
side-information available at the two receivers. This is accomplished using a joint source-channel coding strategy.
Transmission of correlated information over a BC has been addressed in [8]-[12].
In the Han-Costa framework [8], the correlated information is modeled as a pair of memoryless sources
characterized by a fixed single-letter distribution. The pair of sources is first covered using codebooks
constructed from auxiliary random variables; the covering codewords are then transmitted over the BC using
Marton coding. The current setup differs from [8] in two ways. First, the correlated information given by
$(\tilde{\mathbf{U}}, \tilde{\mathbf{W}})$ and $(\tilde{\mathbf{V}}, \tilde{\mathbf{W}})$ does not exhibit a memoryless-source-like behavior. This is because the vectors $\tilde{\mathbf{U}}$, $\tilde{\mathbf{V}}$,
and $\tilde{\mathbf{W}}$ come from codebooks. However, when the codewords are sufficiently long and are chosen randomly,
$(\tilde{\mathbf{U}}, \tilde{\mathbf{V}}, \tilde{\mathbf{W}})$ will be jointly typical and can be covered using auxiliary codebooks similar to [8]. The second
difference from [8] is the presence of source state information $\tilde{\mathbf{S}}$ and side-information $\tilde{\mathbf{Y}}$ and $\tilde{\mathbf{Z}}$ available at
receivers 1 and 2, respectively. We handle this by extending both the covering and channel coding steps of the
Han-Costa scheme to incorporate the side-information. Thus at the end of the second phase, the decoders are
able to decode their respective messages.

We will superimpose the two phases using a block-Markov strategy. The overall transmission scheme has 
several blocks, with fresh information entering in each block being decoded in the subsequent block. The fresh 
information gets encoded in the first phase, and is superimposed on the second phase which corresponds to 
information that entered in the previous block. 

It turns out that the performance of such a scheme cannot be directly captured by single-letter information 
quantities. This is because the state information, given by the channel outputs of all the previous blocks, keeps 
accumulating, leading to a different joint distribution of the random variables in each block. We address this 
issue by constraining the distributions used in the second phase (Definition 2.3) so that in every block, all the 
sequences follow a stationary joint distribution. This results in a first-order stationary Markov process of the 
sequences across blocks.

The rest of the paper is organized as follows. In Section 2, we define the problem formally and state the main
result of the paper, an achievable rate region for the BC with generalized feedback. We give an outline of the
proof of the coding theorem in Section 3. In Section 4, we use the proposed region to compute achievable rates
for the AWGN BC with noisy feedback. We also compare our region with the one proposed by Shayevitz and
Wigger. The formal proof of the coding theorem is given in Section 5, and Section 6 concludes the paper.

2 Problem Statement and Main Result 

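A two-user discrete memoryless broadcast channel with generalized feedback is a quintuple $(\mathcal{X}, \mathcal{Y}, \mathcal{Z}, \mathcal{S}, P_{YZS|X})$
of input alphabet $\mathcal{X}$, two output alphabets $\mathcal{Y}$, $\mathcal{Z}$, feedback alphabet $\mathcal{S}$, and a set of probability distributions
$P_{YZS|X}(\cdot|x)$ on $\mathcal{Y} \times \mathcal{Z} \times \mathcal{S}$ for every $x \in \mathcal{X}$. The channel satisfies the following condition for all $n = 1, 2, \ldots$

$$\Pr(Y_n = y_n, Z_n = z_n, S_n = s_n \mid X^n = \mathbf{x}, Y^{n-1} = \mathbf{y}, Z^{n-1} = \mathbf{z}, S^{n-1} = \mathbf{s}) = P_{YZS|X}(y_n, z_n, s_n \mid x_n) \tag{1}$$

for all $(y_n, z_n, s_n) \in \mathcal{Y} \times \mathcal{Z} \times \mathcal{S}$, $\mathbf{x} \in \mathcal{X}^n$, and $(\mathbf{y}, \mathbf{s}, \mathbf{z}) \in \mathcal{Y}^{n-1} \times \mathcal{S}^{n-1} \times \mathcal{Z}^{n-1}$. The schematic is shown in
Figure 1(b). We note that the broadcast channel with noiseless feedback from both receivers is a special case
with $\mathcal{S} = \mathcal{Y} \times \mathcal{Z}$, and $S_n = (Y_n, Z_n)$.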

Definition 2.1. An $(n, M_0, M_1, M_2)$ transmission system for a given broadcast channel with generalized
feedback consists of

• A sequence of mappings for the encoder:

$$e_m : \{1, 2, \ldots, M_0\} \times \{1, 2, \ldots, M_1\} \times \{1, 2, \ldots, M_2\} \times \mathcal{S}^{m-1} \to \mathcal{X}, \quad m = 1, 2, \ldots, n, \tag{2}$$

• A pair of decoder mappings:

$$g_1 : \mathcal{Y}^n \to \{1, 2, \ldots, M_0\} \times \{1, 2, \ldots, M_1\}, \quad g_2 : \mathcal{Z}^n \to \{1, 2, \ldots, M_0\} \times \{1, 2, \ldots, M_2\}. \tag{3}$$

Remark: Though we have defined the transmission system above for feedback delay 1, all the results in
this paper hold for feedback with any finite delay $k$.

We use $W_0$ to denote the common message, and $W_1, W_2$ to denote the private messages of decoders
1 and 2, respectively. The messages $(W_0, W_1, W_2)$ are uniformly distributed over the set $\{1, 2, \ldots, M_0\} \times
\{1, 2, \ldots, M_1\} \times \{1, 2, \ldots, M_2\}$. The channel input at time $n$ is given by $X_n = e_n(W_0, W_1, W_2, S^{n-1})$. The
average error probability of the above transmission system is given by

$$\tau = \frac{1}{M_0 M_1 M_2} \sum_{k=1}^{M_0} \sum_{i=1}^{M_1} \sum_{j=1}^{M_2} \Pr\big( (g_1(Y^n), g_2(Z^n)) \neq ((k, i), (k, j)) \,\big|\, (W_0, W_1, W_2) = (k, i, j) \big). \tag{4}$$

Definition 2.2. A triple of non-negative real numbers $(R_0, R_1, R_2)$ is said to be achievable for a given
broadcast channel with feedback if for all $\epsilon > 0$, there exists an $N(\epsilon) > 0$ such that for all $n > N(\epsilon)$, there exists an
$(n, M_0, M_1, M_2)$ transmission system satisfying the following constraints:

$$\frac{1}{n} \log M_0 > R_0 - \epsilon, \quad \frac{1}{n} \log M_1 > R_1 - \epsilon, \quad \frac{1}{n} \log M_2 > R_2 - \epsilon, \quad \tau < \epsilon. \tag{5}$$

The closure of the set of all achievable rate triples is the capacity region of the channel.

We now define the structure for the joint distribution of all the variables in our coding scheme. Due to the
block-Markov nature of the scheme, the random variables carrying the resolution information in each block 
depend on the variables corresponding to the previous block. In order to obtain a single-letter rate region, 
we need the random variables in each block to follow the same joint distribution, say P. Hence, after each 
block of transmission, we generate the variables for the next block using a Markov kernel Q that has invariant 
distribution P. This will guarantee a stationary joint distribution P in each block. 
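The invariance requirement on the kernel can be checked numerically for small alphabets. The sketch below is an illustration only and is not taken from the paper: the helper names `induced_marginal` and `is_consistent` are hypothetical, a single symbol `s` stands in for the whole collection of previous-block variables, and a single symbol `t` stands in for the new resolution variables.

```python
def induced_marginal(p_prev, kernel):
    """Marginal of the new variables: sum over s of kernel[s][t] * p_prev[s]."""
    out = {}
    for s, prob_s in p_prev.items():
        for t, q in kernel[s].items():
            out[t] = out.get(t, 0.0) + q * prob_s
    return out


def is_consistent(p_target, p_prev, kernel, tol=1e-9):
    """Check that the kernel maps the previous-block law onto p_target."""
    induced = induced_marginal(p_prev, kernel)
    symbols = set(p_target) | set(induced)
    return all(
        abs(p_target.get(t, 0.0) - induced.get(t, 0.0)) <= tol for t in symbols
    )


# Toy example (assumed values): a uniform previous-block symbol passed
# through a binary symmetric kernel keeps the uniform marginal.
p_prev = {0: 0.5, 1: 0.5}
p_target = {0: 0.5, 1: 0.5}
bsc_kernel = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
constant_kernel = {0: {0: 1.0}, 1: {0: 1.0}}
```

In this toy setting the symmetric kernel passes the check while the constant kernel does not, since it would force the new variables onto a single value in every block.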

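Definition 2.3. Given a broadcast channel with feedback $(\mathcal{X}, \mathcal{Y}, \mathcal{Z}, \mathcal{S}, P_{YZS|X})$, define $\mathcal{P}$ as the set of all
distributions $P$ on $\mathcal{U} \times \mathcal{V} \times \mathcal{A} \times \mathcal{B} \times \mathcal{C} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{Z} \times \mathcal{S}$ of the form

$$P_{ABC}\, P_{UV|ABC}\, P_{X|ABCUV}\, P_{YZS|X},$$

where $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, $\mathcal{U}$, and $\mathcal{V}$ are arbitrary sets. Consider two sets of random variables $(U, V, A, B, C, X, Y, Z, S)$
and $(\tilde{U}, \tilde{V}, \tilde{A}, \tilde{B}, \tilde{C}, \tilde{X}, \tilde{Y}, \tilde{Z}, \tilde{S})$, each having the same distribution $P$. For brevity, we often refer to the collection
$(A, B, S)$ as $K$, to $(\tilde{A}, \tilde{B}, \tilde{S})$ as $\tilde{K}$, and to $\mathcal{A} \times \mathcal{B} \times \mathcal{S}$ as $\mathcal{K}$. Hence

$$P_{\tilde{U}\tilde{V}\tilde{C}\tilde{K}\tilde{X}\tilde{Y}\tilde{Z}} = P_{UVCKXYZ} = P.$$

For a given $P \in \mathcal{P}$, define $\mathcal{Q}(P)$ as the set of conditional distributions $Q$ that satisfy the following
consistency condition:

$$P_{ABC}(a, b, c) = \sum_{(\tilde{u}, \tilde{v}, \tilde{k}, \tilde{c}) \in \mathcal{U} \times \mathcal{V} \times \mathcal{K} \times \mathcal{C}} Q_{ABC|\tilde{U}\tilde{V}\tilde{K}\tilde{C}}(a, b, c \mid \tilde{u}, \tilde{v}, \tilde{k}, \tilde{c})\, P_{UVKC}(\tilde{u}, \tilde{v}, \tilde{k}, \tilde{c}), \quad \forall (a, b, c). \tag{6}$$

Then for any $P \in \mathcal{P}$ and $Q \in \mathcal{Q}(P)$, the joint distribution of the two sets $(\tilde{U}, \tilde{V}, \tilde{K}, \tilde{C}, \tilde{X}, \tilde{Y}, \tilde{Z})$ and $(U, V, K, C, X, Y, Z)$
is

$$P_{\tilde{U}\tilde{V}\tilde{K}\tilde{C}\tilde{X}\tilde{Y}\tilde{Z}}\; Q_{ABC|\tilde{U}\tilde{V}\tilde{C}\tilde{K}}\; P_{UVKXYZ|ABC}. \tag{7}$$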

With the above definitions, we have the following theorem. 
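Theorem 1. Given a broadcast channel with generalized feedback $(\mathcal{X}, \mathcal{Y}, \mathcal{Z}, \mathcal{S}, P_{YZS|X})$, for any distribution
$P \in \mathcal{P}$ and $Q \in \mathcal{Q}(P)$, the convex hull of the following region is achievable.

$$R_0 < \min\{T_1, T_2, T_3, T_4, \tfrac{1}{2} T_5\} \tag{8}$$

$$R_0 + R_1 < I(\tilde{U} A C; Y \tilde{Y} \tilde{A} \mid \tilde{C}) - I(\tilde{V} \tilde{K}; A C \mid \tilde{U} \tilde{C}) \tag{9}$$

$$R_0 + R_2 < I(\tilde{V} B C; Z \tilde{Z} \tilde{B} \mid \tilde{C}) - I(\tilde{U} \tilde{K}; B C \mid \tilde{V} \tilde{C}) \tag{10}$$

$$R_0 + R_1 + R_2 < I(\tilde{U} A C; Y \tilde{Y} \tilde{A} \mid \tilde{C}) - I(\tilde{V} \tilde{K}; A C \mid \tilde{U} \tilde{C}) - T + I(\tilde{V}; C \mid \tilde{C}) + I(\tilde{V} B; Z \tilde{Z} \tilde{B} \mid C \tilde{C}) - I(\tilde{U} \tilde{K} A; B \mid C \tilde{V} \tilde{C}) \tag{11}$$

$$R_0 + R_1 + R_2 < I(\tilde{V} B C; Z \tilde{Z} \tilde{B} \mid \tilde{C}) - I(\tilde{U} \tilde{K}; B C \mid \tilde{V} \tilde{C}) - T + I(\tilde{U}; C \mid \tilde{C}) + I(\tilde{U} A; Y \tilde{Y} \tilde{A} \mid C \tilde{C}) - I(\tilde{V} \tilde{K} B; A \mid C \tilde{U} \tilde{C}) \tag{12}$$

$$2R_0 + R_1 + R_2 < I(\tilde{U} A C; Y \tilde{Y} \tilde{A} \mid \tilde{C}) - I(\tilde{V} \tilde{K}; A C \mid \tilde{U} \tilde{C}) - T + I(\tilde{V} B C; Z \tilde{Z} \tilde{B} \mid \tilde{C}) - I(\tilde{U} \tilde{K}; B C \mid \tilde{V} \tilde{C}) - I(A; B \mid C \tilde{C} \tilde{U} \tilde{V} \tilde{K}) \tag{13}$$

where

$$T = H(U \mid A C) + H(V \mid B C) - H(U V \mid A B C)$$
$$T_1 = I(A C; Y \tilde{Y} \tilde{A} \mid \tilde{C} \tilde{U}) - I(\tilde{V} \tilde{K}; A C \mid \tilde{C} \tilde{U})$$
$$T_2 = I(B C; Z \tilde{Z} \tilde{B} \mid \tilde{C} \tilde{V}) - I(\tilde{U} \tilde{K}; B C \mid \tilde{C} \tilde{V})$$
$$T_3 = I(A C; Y \tilde{Y} \tilde{A} \mid \tilde{C} \tilde{U}) - I(\tilde{V} \tilde{K}; A C \mid \tilde{C} \tilde{U}) + I(B; Z \tilde{Z} \tilde{B} \mid C \tilde{V} \tilde{C}) - I(\tilde{U} \tilde{K} A; B \mid C \tilde{C} \tilde{V})$$
$$T_4 = I(A; Y \tilde{Y} \tilde{A} \mid C \tilde{U} \tilde{C}) - I(\tilde{U} \tilde{K}; B C \mid \tilde{C} \tilde{V}) + I(B C; Z \tilde{Z} \tilde{B} \mid \tilde{C} \tilde{V}) - I(\tilde{V} \tilde{K} B; A \mid C \tilde{C} \tilde{U})$$
$$T_5 = I(A C; Y \tilde{Y} \tilde{A} \mid \tilde{C} \tilde{U}) - I(\tilde{V} \tilde{K}; A C \mid \tilde{C} \tilde{U}) + I(B C; Z \tilde{Z} \tilde{B} \mid \tilde{C} \tilde{V}) - I(\tilde{U} \tilde{K}; B C \mid \tilde{C} \tilde{V}) - I(A; B \mid C \tilde{C} \tilde{U} \tilde{V} \tilde{K})$$

Proof. This theorem is proved in Section 5.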
Remarks: 

1. The input mapping $P_{X|ABCUV}$ in the set of distributions $\mathcal{P}$ can be assumed to be deterministic, i.e.,
$X = f(A, B, C, U, V)$ for some function $f$. This is because for a fixed $P_{ABCUV}$, optimizing the rate
region is equivalent to maximizing a convex functional of $P_{X|ABCUV}$. Hence the optimum occurs at one
of the corner points, which corresponds to a deterministic $P_{X|ABCUV}$.

2. We can recover Marton's achievable rate region for the broadcast channel without feedback by setting 
A = B — (j), and C = W with Qc^jjyxc ~ 



3 Coding scheme 

In this section, we give an informal outline of the proof of Theorem 1. The formal proof is given in Section 5.
Let us first consider the case when there is no common message ($R_0 = 0$). Let the message rate pair $(R_1, R_2)$
lie outside Marton's achievable region [1]. The coding scheme uses a block-Markov superposition strategy,
with the communication taking place over $L$ blocks, each of length $n$.

In each block, a fresh pair of messages is encoded using the Marton coding strategy (for the BC without
feedback). In block $l$, random variables $U$ and $V$ carry the fresh information for receivers 1 and 2, respectively.
At the end of this block, the receivers are not able to decode the information in $(U, V)$ completely, so we send
'resolution' information in block $(l + 1)$ using random variables $(A, B, C)$. The pair $(A, C)$ is meant to be
decoded by the first receiver, and the pair $(B, C)$ by the second receiver. Thus in each block, we obtain the
channel input by superimposing fresh information on the resolution information for the previous block. At
the end of the block, the first receiver decodes $(A, C)$ and the second receiver decodes $(B, C)$, thereby resolving
the uncertainty about their messages of the previous block.

Codebooks: The A-, B-, and C-codebooks are constructed on the alphabets $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$, respectively.
The exact procedure for this construction, and the method for selecting codewords from these codebooks, will
be described in the sequel. Since $(A, C)$ is decoded first by receiver 1, conditioned on each codeword pair
corresponding to the A- and C-codebooks, we construct a U-codebook of size $2^{nR_1'}$ by generating codewords
according to $P_{U|AC}$. Similarly, for each codeword pair in the B- and C-codebooks, we construct a V-codebook
of size $2^{nR_2'}$ by generating codewords according to $P_{V|BC}$. Each U-codebook is divided into $2^{nR_1}$ bins, and
each V-codebook into $2^{nR_2}$ bins.
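The bin structure of a single codebook can be sketched in a few lines. This is a toy illustration under stated assumptions and not the construction used in the proof: codewords are drawn i.i.d. uniformly from a finite alphabet rather than from a conditional distribution given resolution codewords, and the function name `build_binned_codebook` and its parameters are hypothetical.

```python
import random


def build_binned_codebook(n, rate_codebook, rate_message, alphabet=(0, 1), seed=0):
    """Generate 2^(n*rate_codebook) random codewords and split them into
    2^(n*rate_message) equal-size bins, one bin per message index."""
    rng = random.Random(seed)
    num_codewords = 2 ** round(n * rate_codebook)
    num_bins = 2 ** round(n * rate_message)
    if num_bins > num_codewords:
        raise ValueError("message rate must not exceed codebook rate")
    codewords = [
        tuple(rng.choice(alphabet) for _ in range(n)) for _ in range(num_codewords)
    ]
    per_bin = num_codewords // num_bins
    # Bin b holds the codewords that may be used to convey message b.
    return [codewords[b * per_bin:(b + 1) * per_bin] for b in range(num_bins)]


# Example: n = 8, codebook rate 0.75, message rate 0.5
# gives 64 codewords arranged in 16 bins of 4 codewords each.
bins = build_binned_codebook(n=8, rate_codebook=0.75, rate_message=0.5)
```

The excess of the codebook rate over the message rate determines how many candidates each bin offers the encoder when it searches for a jointly typical pair.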

Encoding: In each block $l$, the encoder chooses a tuple of five codewords $(\mathbf{A}_l, \mathbf{B}_l, \mathbf{C}_l, \mathbf{U}_l, \mathbf{V}_l)$ as follows.
The resolution information for block $(l - 1)$ is used to select $(\mathbf{A}_l, \mathbf{B}_l, \mathbf{C}_l)$ from the A-, B-, and C-codebooks. $\mathbf{C}_l$
determines the U- and V-codebooks to be used to encode the message pair of block $l$. Denoting the message
pair by $(m_{1l}, m_{2l})$, the encoder chooses a U-codeword from bin $m_{1l}$ of the U-codebook and a V-codeword
from bin $m_{2l}$ of the V-codebook that are jointly typical according to $P_{UV|ABC}$. This pair of jointly typical
codewords is set to be $(\mathbf{U}_l, \mathbf{V}_l)$.

By standard joint-typicality based covering arguments (see e.g., [16]), this step is successful if the product
of the sizes of the U-bin and the V-bin is exponentially larger than $2^{n(H(U|AC) + H(V|BC) - H(UV|ABC))}$. Therefore, we
have

$$R_1' + R_2' - R_1 - R_2 > H(U \mid AC) + H(V \mid BC) - H(UV \mid ABC). \tag{14}$$

These five codewords are combined using the transformation $P_{X|ABCUV}$ (applied componentwise) to generate
the channel input $\mathbf{X}_l$.
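The right-hand side of (14) can be evaluated directly for a small joint distribution. The following sketch is illustrative only: the function names and the representation of the joint pmf as a dictionary keyed by tuples `(a, b, c, u, v)` are assumptions of this example, not notation from the paper.

```python
from collections import defaultdict
from math import log2


def marginal(pmf, idx):
    """Marginalize a pmf over tuples onto the coordinates listed in idx."""
    out = defaultdict(float)
    for key, p in pmf.items():
        out[tuple(key[i] for i in idx)] += p
    return out


def entropy(pmf):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def cond_entropy(pmf, target, given):
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(marginal(pmf, target + given)) - entropy(marginal(pmf, given))


def binning_penalty(pmf):
    """H(U|AC) + H(V|BC) - H(UV|ABC) for a pmf over tuples (a, b, c, u, v)."""
    A, B, C, U, V = 0, 1, 2, 3, 4
    return (
        cond_entropy(pmf, [U], [A, C])
        + cond_entropy(pmf, [V], [B, C])
        - cond_entropy(pmf, [U, V], [A, B, C])
    )


# Toy cases with trivial (A, B, C): the penalty reduces to I(U; V).
independent = {(0, 0, 0, u, v): 0.25 for u in (0, 1) for v in (0, 1)}
identical = {(0, 0, 0, 0, 0): 0.5, (0, 0, 0, 1, 1): 0.5}
```

For independent uniform bits the penalty is zero, so no extra codewords per bin are needed; when $U = V$ the penalty is one bit, reflecting the cost of finding a jointly typical pair among independently generated codewords.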

Decoding: After receiving the channel output of block $l$, receiver 1 first decodes $(\mathbf{A}_l, \mathbf{C}_l)$, and receiver 2
decodes $(\mathbf{B}_l, \mathbf{C}_l)$. However, the rates $R_1'$, $R_2'$ of the U- and V-codebooks are too large for receivers 1 and
2 to uniquely decode $\mathbf{U}_l$ and $\mathbf{V}_l$, respectively. Hence receiver 1 is left with a list of U-codewords that are
jointly typical with its channel output $\mathbf{Y}_l$ and the just-decoded resolution information $(\mathbf{A}_l, \mathbf{C}_l)$; receiver 2
has a similar list of V-codewords that are jointly typical with its channel output $\mathbf{Z}_l$ and the just-decoded
resolution information $(\mathbf{B}_l, \mathbf{C}_l)$. The sizes of the lists are nearly equal to $2^{n(R_1' - I(U; Y|AC))}$ and $2^{n(R_2' - I(V; Z|BC))}$,
respectively. The transmitter receives the feedback signal $\mathbf{S}_l$ in block $l$, and resolves these lists in the next block
as follows.

In block $(l + 1)$, the random variables of block $l$ are represented using the notation $\tilde{\ }$. Thus we have

$$\tilde{\mathbf{U}}_{l+1} = \mathbf{U}_l, \quad \tilde{\mathbf{V}}_{l+1} = \mathbf{V}_l, \quad \tilde{\mathbf{C}}_{l+1} = \mathbf{C}_l, \quad \tilde{\mathbf{A}}_{l+1} = \mathbf{A}_l, \quad \tilde{\mathbf{B}}_{l+1} = \mathbf{B}_l, \quad \tilde{\mathbf{S}}_{l+1} = \mathbf{S}_l.$$

The random variables $(U, V, A, B, C, Y, Z, S)$ in block $l$ are jointly distributed via $P_{ABC}\, P_{UV|ABC}\, P_{YZS|ABCUV}$
chosen from $\mathcal{P}$ as given in the statement of the theorem.

For block $(l + 1)$, $(\tilde{\mathbf{U}}_{l+1}, \tilde{\mathbf{V}}_{l+1}) = (\mathbf{U}_l, \mathbf{V}_l)$ can be considered to be a realization of a pair of correlated 'sources'
($\tilde{U}$ and $\tilde{V}$), jointly distributed according to $P_{UV|SABC}$ along with the transmitter side information given by
$(\tilde{\mathbf{A}}_{l+1}, \tilde{\mathbf{B}}_{l+1}, \tilde{\mathbf{S}}_{l+1})$, and the common side-information $\tilde{\mathbf{C}}_{l+1}$. The goal in block $(l + 1)$ is to transmit this pair
of correlated sources over the BC, with

• Receiver 1 needing to decode $\tilde{\mathbf{U}}_{l+1}$, treating $(\tilde{\mathbf{A}}_{l+1}, \tilde{\mathbf{Y}}_{l+1}, \tilde{\mathbf{C}}_{l+1})$ as receiver side-information,

• Receiver 2 needing to decode $\tilde{\mathbf{V}}_{l+1}$, treating $(\tilde{\mathbf{B}}_{l+1}, \tilde{\mathbf{Z}}_{l+1}, \tilde{\mathbf{C}}_{l+1})$ as receiver side-information.



We use the ideas of Han and Costa [8] to transmit this pair of correlated sources over the BC (with appropriate
extensions to take into account the different side-information available at the transmitter and the receivers).
This is shown in Figure 3. The triplet of correlated random variables $(A, B, C)$ is used to cover the sources.
This triplet carries the resolution information intended to disambiguate the lists of the two receivers. The
random variables of block $(l + 1)$, given by $(A, B, C)$, are related to the random variables in block $l$ via
$Q_{ABC|\tilde{U}\tilde{V}\tilde{C}\tilde{A}\tilde{B}\tilde{S}}$, chosen from $\mathcal{Q}(P)$ given in the statement of the theorem. We now describe the construction of
the A-, B-, and C-codebooks.

For brevity, we denote the collection of random variables $(\tilde{A}, \tilde{B}, \tilde{S})$ as $\tilde{K}$, and $(\mathbf{A}_l, \mathbf{B}_l, \mathbf{S}_l)$ as $\mathbf{K}_l = \tilde{\mathbf{K}}_{l+1}$.

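Covering the Sources: For each $\tilde{\mathbf{c}} \in \mathcal{C}^n$, a C-codebook $\mathcal{C}_C(\tilde{\mathbf{c}})$ of rate $\rho_0$ is constructed randomly from $P_{C|\tilde{C}}$.
For every realization of $\tilde{\mathbf{u}} \in \mathcal{U}^n$, $\tilde{\mathbf{c}} \in \mathcal{C}^n$, and $\mathbf{c} \in \mathcal{C}^n$, an A-codebook $\mathcal{C}_A(\tilde{\mathbf{u}}, \tilde{\mathbf{c}}, \mathbf{c})$ of rate $\rho_1$ is constructed
with codewords picked randomly according to $P_{A|\tilde{U}\tilde{C}C}$. (From Figure 3, we see that in addition to $\tilde{C}$, receiver 1
also has $(\tilde{A}, \tilde{Y})$ as side-information. However, we do not pick the A-codebook conditioned on these random
variables since they are not available at all three terminals.) For every realization of $\tilde{\mathbf{v}} \in \mathcal{V}^n$, $\tilde{\mathbf{c}} \in \mathcal{C}^n$, and
$\mathbf{c} \in \mathcal{C}^n$, a B-codebook $\mathcal{C}_B(\tilde{\mathbf{v}}, \tilde{\mathbf{c}}, \mathbf{c})$ of rate $\rho_2$ is constructed with codewords picked randomly according to
$P_{B|\tilde{V}\tilde{C}C}$.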

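At the beginning of block $(l + 1)$, for a given realization $(\tilde{\mathbf{U}}_{l+1}, \tilde{\mathbf{V}}_{l+1}, \tilde{\mathbf{K}}_{l+1}, \tilde{\mathbf{C}}_{l+1})$ of correlated 'sources'
and side information, the encoder chooses a triplet of codewords $(\mathbf{A}_{l+1}, \mathbf{B}_{l+1}, \mathbf{C}_{l+1})$ from the appropriate
A-, B-, and C-codebooks such that the two tuples are jointly typical according to $P_{\tilde{U}\tilde{V}\tilde{K}\tilde{C}}\, Q_{ABC|\tilde{U}\tilde{K}\tilde{V}\tilde{C}}$. The
channel input $\mathbf{X}_{l+1}$ is generated by fusing this $(\mathbf{A}_{l+1}, \mathbf{B}_{l+1}, \mathbf{C}_{l+1})$ with the pair of codewords $(\mathbf{U}_{l+1}, \mathbf{V}_{l+1})$,
which carry fresh information in block $(l + 1)$.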

Now consider the general case when $R_0 > 0$. We can use the random variable $C$ to encode common
information to be decoded by both receivers. Hence $C$ serves two purposes: it is used (a) to cover the
correlated sources and transmitter side-information, and is thus part of the resolution information, and (b) to
carry fresh information that is decoded by both receivers. We note that in every block, two communication
tasks are being accomplished simultaneously. The first is joint source-channel coding of correlated sources
over the BC, accomplished via $(A, B, C)$; the second is Marton coding of the fresh information, accomplished
via $(U, V, C)$. $C$ can be made to assume the dual role of the common random variable associated with both
these tasks.

Analysis: For this encoding to be successful, we need the following covering conditions. These are the same
conditions that appear in the Han-Costa scheme (see [17, Lemma 14.1]), with $(\tilde{U}, \tilde{K})$ and $(\tilde{V}, \tilde{K})$ assuming
the roles of the two sources being covered.

$$\rho_0 > I(\tilde{U} \tilde{K} \tilde{V}; C \mid \tilde{C}) + R_0 \tag{15}$$

$$\rho_0 + \rho_1 > I(\tilde{V} \tilde{K}; A \mid C \tilde{C} \tilde{U}) + I(\tilde{U} \tilde{K} \tilde{V}; C \mid \tilde{C}) + R_0 \tag{16}$$

$$\rho_0 + \rho_2 > I(\tilde{U} \tilde{K}; B \mid C \tilde{C} \tilde{V}) + I(\tilde{U} \tilde{K} \tilde{V}; C \mid \tilde{C}) + R_0 \tag{17}$$

$$\rho_0 + \rho_1 + \rho_2 > I(\tilde{V} \tilde{K}; A \mid C \tilde{C} \tilde{U}) + I(\tilde{U} \tilde{K}; B \mid C \tilde{C} \tilde{V}) + I(A; B \mid \tilde{U} \tilde{K} \tilde{V} C \tilde{C}) + I(\tilde{U} \tilde{K} \tilde{V}; C \mid \tilde{C}) + R_0 \tag{18}$$

^Recall that in Marton's achievable region for the BC without feedback, there is a random variable $W$ meant to be decoded
by both receivers.

Figure 3: Transmitting correlated sources with side-information at the receivers through $(A, B, C)$, and fresh
information through $U, V, C$. $C$ plays the dual role: it is used to cover the correlated sources as well as carry fresh
information.

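At the end of block $(l + 1)$, receiver 1 determines $\mathbf{U}_l = \tilde{\mathbf{U}}_{l+1}$ by finding the triple $(\tilde{\mathbf{U}}_{l+1}, \mathbf{A}_{l+1}, \mathbf{C}_{l+1})$ using
joint typical decoding in the composite U-, A-, and C-codebooks. A similar procedure is followed at the second
receiver. For the decoding to be successful, we need the following packing conditions.

$$R_1' + \rho_0 + \rho_1 < I(\tilde{U}; Y \tilde{Y} \tilde{A} \mid \tilde{C}) + I(C; Y \tilde{A} \tilde{Y} \tilde{U} \mid \tilde{C}) + I(A; Y \tilde{A} \tilde{Y} \mid \tilde{U} \tilde{C} C) \tag{19}$$

$$R_1' + \rho_1 < I(\tilde{U}; Y \tilde{A} \tilde{Y} C \mid \tilde{C}) + I(A; Y \tilde{A} \tilde{Y} \mid \tilde{U} \tilde{C} C) \tag{20}$$

$$R_2' + \rho_0 + \rho_2 < I(\tilde{V}; Z \tilde{Z} \tilde{B} \mid \tilde{C}) + I(C; Z \tilde{B} \tilde{Z} \tilde{V} \mid \tilde{C}) + I(B; Z \tilde{B} \tilde{Z} \mid \tilde{V} \tilde{C} C) \tag{21}$$

$$R_2' + \rho_2 < I(\tilde{V}; Z \tilde{B} \tilde{Z} C \mid \tilde{C}) + I(B; Z \tilde{B} \tilde{Z} \mid \tilde{V} \tilde{C} C) \tag{22}$$

$$\rho_0 + \rho_1 < I(C; Y \tilde{A} \tilde{Y} \tilde{U} \mid \tilde{C}) + I(A; Y \tilde{A} \tilde{Y} \mid \tilde{U} \tilde{C} C) \tag{23}$$

$$\rho_0 + \rho_2 < I(C; Z \tilde{B} \tilde{Z} \tilde{V} \mid \tilde{C}) + I(B; Z \tilde{B} \tilde{Z} \mid \tilde{V} \tilde{C} C) \tag{24}$$

$$\rho_1 < I(A; Y \tilde{A} \tilde{Y} \mid \tilde{U} \tilde{C} C) \tag{25}$$

$$\rho_2 < I(B; Z \tilde{B} \tilde{Z} \mid \tilde{V} \tilde{C} C) \tag{26}$$

Performing Fourier-Motzkin elimination on equations (14), (15)-(18), and (19)-(26), we obtain the statement
of the theorem.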
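One elimination step can be worked out by hand to show how the single-letter bounds arise. The sketch below recovers the bound on $R_0 + R_1$ in (9) from one packing and one covering condition. It assumes that the left-hand side of the packing condition (19) is $R_1' + \rho_0 + \rho_1$ (the combined rate of the composite U-, A-, and C-codebooks), that tildes mark previous-block variables as in the notation of this section, and that $R_1' \geq R_1$; these are assumptions of this illustration rather than statements quoted from the formal proof.

```latex
% Packing condition (19) and covering condition (16), as assumed above:
\begin{align*}
R_1' + \rho_0 + \rho_1 &< I(\tilde{U}; Y\tilde{Y}\tilde{A} \mid \tilde{C})
   + I(C; Y\tilde{A}\tilde{Y}\tilde{U} \mid \tilde{C})
   + I(A; Y\tilde{A}\tilde{Y} \mid \tilde{U}\tilde{C}C), \\
\rho_0 + \rho_1 &> I(\tilde{V}\tilde{K}; A \mid C\tilde{C}\tilde{U})
   + I(\tilde{U}\tilde{K}\tilde{V}; C \mid \tilde{C}) + R_0 .
\end{align*}
% Chain-rule expansions used to combine them:
\begin{align*}
I(C; Y\tilde{A}\tilde{Y}\tilde{U} \mid \tilde{C})
   &= I(C; \tilde{U} \mid \tilde{C}) + I(C; Y\tilde{A}\tilde{Y} \mid \tilde{U}\tilde{C}), \\
I(\tilde{U}\tilde{K}\tilde{V}; C \mid \tilde{C})
   &= I(\tilde{U}; C \mid \tilde{C}) + I(\tilde{V}\tilde{K}; C \mid \tilde{U}\tilde{C}).
\end{align*}
% Subtracting the covering bound from the packing bound eliminates
% \rho_0 + \rho_1; the terms I(C; \tilde{U} | \tilde{C}) cancel:
\begin{align*}
R_0 + R_1'
 &< I(\tilde{U}; Y\tilde{Y}\tilde{A} \mid \tilde{C})
  + I(C; Y\tilde{Y}\tilde{A} \mid \tilde{U}\tilde{C})
  + I(A; Y\tilde{Y}\tilde{A} \mid \tilde{U}\tilde{C}C) \\
 &\quad - I(\tilde{V}\tilde{K}; C \mid \tilde{U}\tilde{C})
  - I(\tilde{V}\tilde{K}; A \mid C\tilde{C}\tilde{U}) \\
 &= I(\tilde{U}AC; Y\tilde{Y}\tilde{A} \mid \tilde{C})
  - I(\tilde{V}\tilde{K}; AC \mid \tilde{U}\tilde{C}).
\end{align*}
% With R_1' \ge R_1 this gives the bound on R_0 + R_1.
```

The remaining inequalities of the theorem follow from analogous pairings of packing and covering conditions, with (14) used whenever both $R_1'$ and $R_2'$ appear.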

To get a single-letter characterization of achievable rates, we need to ensure that the random variables in
each block follow a stationary joint distribution. We now describe how we ensure that the sequences in each
block are jointly distributed according to

$$P_{ABC} \cdot P_{UV|ABC} \cdot P_{X|ABCUV} \cdot P_{YZS|X} \tag{27}$$

for some chosen $P_{ABC}$, $P_{UV|ABC}$, and $P_{X|ABCUV}$.

Suppose that the sequences in a given block are jointly distributed according to (27). In the next block, these
sequences become the source pair $(\tilde{U}, \tilde{V})$, the transmitter side-information $(\tilde{A}, \tilde{B}, \tilde{C}, \tilde{S})$, and the side information
at the two receivers, $(\tilde{C}, \tilde{A}, \tilde{Y})$ and $(\tilde{C}, \tilde{B}, \tilde{Z})$, respectively. To cover the source pair with $(A, B, C)$, we pick a
conditional distribution $Q_{ABC|\tilde{A}\tilde{B}\tilde{C}\tilde{U}\tilde{V}\tilde{S}}$ such that the covering sequences are distributed according to $P_{ABC}$.
This holds when the consistency condition given by (6) is satisfied. We thereby ensure that the sequences in
each block are jointly distributed according to (27). Our technique of exploiting the correlation induced by
feedback is similar in spirit to the coding scheme of Han for two-way channels [18].

^Though $\tilde{K} = (\tilde{A}, \tilde{B}, \tilde{S})$ is included in the covering, it is not required to be explicitly decoded at either receiver.

We note that the transmitter side information $\tilde{K} = (\tilde{A}, \tilde{B}, \tilde{S})$ is exploited at the encoder in the covering
operation implicitly, without using codebooks conditioned on $\tilde{K}$. This is because this side information is only
partially available at the receivers, with receiver 1 having only $(\tilde{A}, \tilde{Y})$, and receiver 2 having only $(\tilde{B}, \tilde{Z})$.
Hence the coding approach does not depend on any assumptions on the nature of the generalized feedback
signal $S$. This is in contrast to communication over a multiple-access channel with feedback, where there is a
significant difference between noiseless feedback and noisy feedback [19].

4 Special Cases and Examples 

In this section, we obtain a simpler version of the rate region of Theorem 1 and use it to compute achievable
rates for a few examples.

4.1 A Simpler Rate Region 

Corollary 4.1. Given a broadcast channel with generalized feedback {X ,y, Z,S, Pyzsix)^ define any joint 
distribution P of the form 

PcoPwuvPx\wuvCoPyzs\x- (28) 

for some discrete random variables W, U, V, Cq. Let (Co, W, U, V, X, Y, Z, S) and (Co, W , U, V, X, F, Z, S) be 
two sets of variables each distributed according to P and jointly distributed as 

PcoWUVXYZS Qco\CoWUVS PwUVXYZS\Ca- (29) 

where Qco\CoWiJVS distribution such that 

Pcoico) = ^ Q^^^^^-^(jyg{co\co,w,u,v,s)P{cQ,w,u,v,s), VcoSCo. (30) 

Co ,w,u,v,s 
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Then the following region is achievable. 



Rn<mm{ri,T2} (31) 
Ra + Ri < IiUW]Y\Co) + IiCo;Y\YCoW) + IiCo\Y\CoWU) ~ IiVS;Co\CoWU) (32) 
Ro + R2 < I{VW- Z\Co) + /(Co; Z\ZCoW) + /(Co; Z\CoWV) - I{US; Co|Co W) (33) 
Rq + Ri+R2< I{UW; Y\Co) + /(Co; Y\YCoW) + I{Ca;Y\CoWU) - I{VS; Ca\CaWU) (34) 

+ I{CoZ; V\CoW) - I(U; V\W) 
Ro + Ri+R2< I{VW] Z\Ca) + /(Co; Z\ZCoW) + /(Co; Z\CoWV) - /(?75; Co|CoW) (35) 
+ I{CoY: if\CoW) - I{U; V\W) 
2Ro + Ri+ R2 < I{UW- Y\Co) + I{Co; Y\YCoW) + I{Co;Y\CqWU) - I{VS; Co|CoW^J7) (36) 
+ I{VW; Z\Ca) + /(Cq; Z\ZCoW) + /(Co; Z\CoWV) - I{US; CojCol^V^) - I{U; V\W) 

where 

Ti = IiCa;Y\CoWU) + /(CoTF; Y\YCoWU) - I{VS; Co|CoW) 
T2 ^ /(Co; Z|Co W) + I{CoW; Z\ZCoWV) - I{US; Co|Co W) 

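Proof. In Theorem 1 set $A = B = \emptyset$, and $C = (C_0, W)$, with 

$$Q_{C|\tilde C \tilde U \tilde V \tilde S} = Q_{C_0 W|\tilde C_0 \tilde W \tilde U \tilde V \tilde S} = P_W\, Q_{C_0|\tilde C_0 \tilde W \tilde U \tilde V \tilde S}.$$

For this choice, $Q_{C|\tilde C \tilde U \tilde V \tilde S}$ is a valid conditional distribution for Theorem 1 provided (30) is satisfied. □ 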



4.2 The AWGN Broadcast Channel with Noisy Feedback 



We now use Corollary 4.1 to compute achievable rates for the scalar AWGN broadcast channel with noisy 
feedback from one receiver. We compare the obtained sum rate with: a) the maximum sum rate in the absence 
of feedback, b) the achievable region of Bhaskaran [6] for the case with noiseless feedback from one receiver, 
and c) the achievable region of Ozarow and Leung [5] for noiseless feedback from both receivers. We note that 
the coding schemes in both [6] and [5] are linear schemes based on Schalkwijk-Kailath coding for the AWGN 
channel [20], and cannot be used when there is noise in the feedback link [21]. Our rate region also includes 
the possibility of a common message to both receivers, whereas the coding schemes of [5] and [6] are constructed 
only for private messages. 

The channel, with $\mathcal{X} = \mathcal{Y} = \mathcal{Z} = \mathbb{R}$, is described by 

$$Y = X + N_1, \qquad Z = X + N_2, \qquad (37)$$

where $N_1, N_2$ are Gaussian noise variables with zero mean and covariance matrix 

$$\sigma^2 \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix},$$

where $\rho \in [-1, 1]$. $N_1$ and $N_2$ are independent of the channel input $X$. The input sequence $\mathbf{x}$ for each block 
satisfies an average power constraint $\frac{1}{n}\sum_{i=1}^{n} x_i^2 \le P$. 

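In the absence of feedback, the capacity region of the AWGN broadcast channel is known [22, 23] and can 
be obtained from Marton's inner bound using the following choice of random variables: 

$$V = \sqrt{\bar\alpha P}\, Q_2, \qquad U = \sqrt{\alpha P}\, Q_1 + \frac{\alpha P}{\alpha P + \sigma^2}\, V,$$

where $\alpha \in (0, 1)$, $\bar\alpha = 1 - \alpha$, and $Q_1, Q_2$ are independent $\mathcal{N}(0, 1)$ random variables. The Marton sum rate is then given by 

$$R_{\text{no-FB}} = I(V; Z) + I(U; Y) - I(U; V) = \frac{1}{2} \log_2\left(1 + \frac{P}{\sigma^2}\right). \qquad (38)$$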



This is essentially the 'writing on dirty paper' coding strategy [24, 25]: for the channel from U to Y, V can 
be considered as channel state information known at the encoder. We note that an alternate way of achieving 
the no-feedback capacity region of the AWGN broadcast channel is through superposition coding [2].^1 
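The sum-rate identity in (38) can be checked numerically. The following is a minimal Python sketch, assuming the standard dirty-paper result that the U-layer (power αP) sees an interference-free channel, so that I(U;Y) - I(U;V) = ½ log2(1 + αP/σ²), while receiver 2 decodes V (power (1-α)P) treating the U-layer as noise. The function names are illustrative and do not come from the paper.

```python
import math

def gaussian_mi(signal_power, noise_power):
    """Mutual information in bits of a real Gaussian channel: 0.5*log2(1 + S/N)."""
    return 0.5 * math.log2(1.0 + signal_power / noise_power)

def marton_dpc_sum_rate(P, sigma2, alpha):
    """I(V;Z) + I(U;Y) - I(U;V) for the dirty-paper choice of (U, V).

    Assumption: V carries power (1 - alpha)*P and the U-layer carries alpha*P.
    """
    rate_v = gaussian_mi((1.0 - alpha) * P, alpha * P + sigma2)  # I(V;Z)
    rate_u = gaussian_mi(alpha * P, sigma2)                      # I(U;Y) - I(U;V)
    return rate_v + rate_u

def no_feedback_sum_rate(P, sigma2):
    """Right-hand side of (38)."""
    return gaussian_mi(P, sigma2)

if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, marton_dpc_sum_rate(10.0, 1.0, alpha), no_feedback_sum_rate(10.0, 1.0))
```

The sum rate does not depend on α; only the split between the two private rates does.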

Using Corollary 4.1, we now compute an achievable region for the channel (37) with noisy feedback from 
receiver 1 alone. The feedback signal is given by 

$$S = Y + N_f, \qquad (39)$$

where $N_f$ is additive white Gaussian noise on the feedback link distributed as $\mathcal{N}(0, \sigma_f^2)$. $N_f$ is independent of 
$X, Y, Z, N_1$ and $N_2$. 

To motivate the choice of joint distribution, let us first consider the case of noiseless feedback, i.e., $N_f = 0$. 



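Noiseless Feedback: The joint distribution $P_{C_0} P_{UV} P_{X|C_0 UV}$ is chosen as 

$$V = \sqrt{\bar\alpha P_1}\, Q_2, \qquad U = \sqrt{\alpha P_1}\, Q_1 + \beta V, \qquad (40)$$
$$X = \sqrt{P - P_1}\, C_0 + \sqrt{\bar\alpha P_1}\, Q_2 + \sqrt{\alpha P_1}\, Q_1, \qquad (41)$$

where $Q_1, Q_2, C_0$ are independent Gaussians with zero mean and unit variance, $\bar\alpha = 1 - \alpha$, and $\alpha, \beta \in (0, 1)$, $P_1 \in (0, P)$ 
are parameters to be optimized later. 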

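Next we define a conditional distribution $Q_{C_0|\tilde C_0 \tilde U \tilde V \tilde Y \tilde Z}$ that satisfies (30). Let 

$$\tilde T_1 = \frac{\tilde U - E[\tilde U|\tilde Y \tilde C_0]}{\sqrt{E\big[(\tilde U - E[\tilde U|\tilde Y \tilde C_0])^2\big]}}. \qquad (42)$$

Then define $Q_{C_0|\tilde C_0 \tilde U \tilde V \tilde Y \tilde Z}$ by the relation 

$$C_0 = \sqrt{1 - D}\, \tilde T_1 + \zeta, \qquad (43)$$

where $\zeta$ is a $\mathcal{N}(0, D)$ random variable independent of $(\tilde C_0, \tilde U, \tilde V, \tilde Y, \tilde Z)$. 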

In words, $\tilde T_1$ is the normalized error in the estimate of $\tilde U$ at receiver 1. This estimation error is quantized at 
distortion level $D$ and suitably scaled to obtain $C_0$. Thus, in each block, $C_0$ represents a quantized version of 
the estimation error at receiver 1 in the previous block. If we similarly denote by $\tilde T_2$ the error in the estimate 
of $\tilde V$ at receiver 2, then $\tilde T_2$ is correlated with $\tilde T_1$. Therefore, $C_0$ simultaneously plays the role of conveying 
information about $\tilde T_1$ to receiver 1, and about $\tilde T_2$ to receiver 2. With this choice of joint distribution, the 
information quantities in Corollary 4.1 can be computed. 

^1 Theorem 1 was established for a discrete memoryless broadcast channel with feedback. These results can be extended to 
the AWGN broadcast channel using a similar proof, recognizing that in the Gaussian case superposition is equivalent to addition. 

Figure 4: Achievable sum rates for the AWGN broadcast channel with noisy feedback from receiver 1. Noise correlation 
$\rho = 0$. The three solid lines show the sum-rates computed using Corollary 4.1 for feedback noise variance $\sigma_f^2 = 0$, $\sigma^2/10$, 
and $\sigma^2$. The dashed line at the bottom is the no-feedback sum rate, the dotted line in the middle is the sum-rate of 
the Bhaskaran scheme, and the * symbols at the top are the sum rate of the Ozarow-Leung scheme. 
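As a sanity check on the normalization used in (42) and (43), the sketch below evaluates the variance of the estimation error at receiver 1 in closed form and confirms that the resulting C0 has unit variance. It assumes the Gaussian choice U = sqrt(αP1) Q1 + βV, V = sqrt((1-α)P1) Q2 and C0 = sqrt(1-D) T1 + ζ with ζ ~ N(0, D); the function names are hypothetical.

```python
import math

def estimation_error_variance(P1, sigma2, alpha, beta):
    """Variance of U - E[U | Y, C0] under the assumed Gaussian construction.

    Once the known C0 component is subtracted, Y reduces to
    sqrt(alpha*P1)*Q1 + sqrt((1-alpha)*P1)*Q2 + N1, which has variance P1 + sigma2.
    """
    var_u = alpha * P1 + beta ** 2 * (1.0 - alpha) * P1
    cov_uy = P1 * (alpha + beta * (1.0 - alpha))
    # Linear MMSE error: var(U) - cov(U, Y)^2 / var(Y)
    return var_u - cov_uy ** 2 / (P1 + sigma2)

def c0_moments(D):
    """Variance of C0 and its correlation with the unit-variance error T1."""
    variance = (1.0 - D) * 1.0 + D          # independent terms add
    correlation = math.sqrt(1.0 - D) / math.sqrt(variance)
    return variance, correlation

if __name__ == "__main__":
    print(estimation_error_variance(P1=2.0, sigma2=1.0, alpha=0.3, beta=0.5))
    print(c0_moments(0.25))
```

A smaller distortion D makes C0 more strongly correlated with the estimation error, at the cost of a larger covering rate.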

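Noisy Feedback: When the feedback is noisy, the transmitter does not know $\tilde Y$, and so cannot compute 
$\tilde U - E[\tilde U|\tilde Y \tilde C_0]$ in (42), which was used to generate $C_0$. Instead, the transmitter can compute an estimate of 
the error at receiver 1. We now define $\tilde T_1$ as 

$$\tilde T_1 = \frac{\Delta}{\sqrt{E[\Delta^2]}}, \qquad (44)$$

where 

$$\Delta = E\big[(\tilde U - E[\tilde U|\tilde Y \tilde C_0]) \,\big|\, \tilde U \tilde V \tilde C_0 \tilde S\big] = \sqrt{\alpha P_1}\left(1 - \frac{P_1(\alpha + \beta\bar\alpha)}{P_1 + \sigma^2}\right)\tilde Q_1 + \sqrt{\bar\alpha P_1}\, \frac{\beta\sigma^2 - \alpha\bar\beta P_1}{P_1 + \sigma^2}\, \tilde Q_2 - \frac{P_1(\alpha + \beta\bar\alpha)}{P_1 + \sigma^2} \cdot \frac{\sigma^2}{\sigma^2 + \sigma_f^2}\, (\tilde N_1 + \tilde N_f), \qquad (45)$$

with $\bar\alpha = 1 - \alpha$ and $\bar\beta = 1 - \beta$. 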

As before, $C_0$ is defined by (43) with $\tilde T_1$ given by (44), and the input $X$ is defined by (41). With this choice 
of joint distribution, the information quantities required to evaluate Corollary 4.1 are computed and listed in 
Appendix A. 

Figure 5: Variation of the sum-rate vs correlation coefficient of $(N_1, N_2)$. $P/\sigma^2 = 10$ and there is noiseless feedback 
from receiver 1. The dashed line shows the no-feedback sum-rate. 

For different values of the signal-to-noise ratio $P/\sigma^2$, feedback noise variance $\sigma_f^2$ and correlation coefficient 
$\rho$, we can compute the maximum sum rate by numerically optimizing over the parameters $(\alpha, \beta, D, P_1)$. For 
the case where the noises at the two receivers are independent ($\rho = 0$), the maximum sum rate is plotted 
in Figure 4 for $\sigma_f^2 = \sigma^2$, $\sigma^2/10$ and $0$ ($\sigma_f^2 = 0$ is noiseless feedback). The figure also shows the sum rate in the 
absence of feedback, the sum rate of the Bhaskaran scheme [6] for noiseless feedback from one receiver, and 
the maximum sum rate of the Ozarow-Leung scheme with noiseless feedback from both receivers. 

We see that the obtained sum rate is higher than the no-feedback sum rate even with feedback noise variance 
$\sigma_f^2 = \sigma^2$, and increases as $\sigma_f^2$ decreases. We also observe that for $\sigma_f^2 = 0$ (noiseless feedback), the sum rate of 
the proposed rate region is higher than the Bhaskaran sum rate for high SNR. Concretely, for $P/\sigma^2 = 10, 100$ 
and $1000$, our region yields sum rates of 1.842, 3.612 and 5.378, respectively; the Bhaskaran sum rates for these 
SNR values are 1.852, 3.452 and 5.105. The Ozarow-Leung scheme yields higher sum rates than the proposed 
region, but we emphasize that it uses noiseless feedback from both receivers. Another difference is that both 
the Ozarow-Leung and Bhaskaran schemes are specific to the AWGN broadcast channel and do not extend to 
other discrete memoryless broadcast channels, unlike the scheme in this paper. 

Figure 5 shows the effect of $\rho$ (the correlation coefficient of $N_1, N_2$) on the noiseless feedback sum-rate with 
$P/\sigma^2$ held fixed. Note that the sum-rate without feedback does not change with $\rho$ as long as the individual 
noise variances remain unchanged [12]. We observe that the sum-rate decreases monotonically with the noise 
correlation and is equal to the no-feedback rate at $\rho = 1$. This is consistent with the fact that feedback does 
not increase the capacity of the AWGN broadcast channel with $\rho = 1$ since it is physically degraded (in fact, 
we effectively have a point-to-point channel when $\rho = 1$). 

4.3 Comparison with the Shayevitz-Wigger (S-W) Rate Region 

An achievable rate region for the broadcast channel with feedback was independently proposed by Shayevitz 
and Wigger [15]. Their coding scheme can be summarized as follows. In the first block, the encoder transmits 
at rates outside the Marton region. The receivers cannot decode, and as discussed earlier, the information 
needed to resolve the ambiguity at the two receivers is correlated. This resolution information is transmitted 
in the next block through separate source and channel coding. The correlated resolution information is 
first quantized into three parts: a common part, and a private part for each receiver. This quantization 
is performed using a generalization of Gray-Wyner coding [26]. The quantization indices representing the 
correlated information are then transmitted together with fresh information for the second block using Marton 
coding. 

While the S-W scheme is also a block-Markov superposition scheme with the Marton coding as the starting 
point, the S-W scheme differs from the one proposed in this paper in two aspects: 

1. Separate source and channel coding 

2. Backward decoding 

While separate source and channel coding can be considered a special case of joint source-channel coding, 
the backward decoding technique in [15] uses the resolution information in a different way than our scheme. 
In particular, the covering random variables in each block are decoded first and serve as extra 'outputs' at 
the receivers that augment the channel outputs. This difference in the decoding strategy makes a general 
comparison of the two rate regions difficult. 

In Appendix B, we show that the class of valid joint distributions for the S-W region can be obtained using 
our coding scheme via a specific choice of the covering variables $(A, B, C)$. The rate region of Theorem 1 
evaluated with this class of distributions is given in (81). We observe that the bounds on $R_0 + R_1$, $R_0 + R_2$, 
$R_0 + R_1 + R_2$ and $2R_0 + R_1 + R_2$ are larger than the corresponding bounds in the S-W region. However, our 
region has an additional $R_0$ constraint which is not subsumed by the other constraints. Therefore a general 
statement about the inclusion of one region in the other does not seem possible. In the following, we focus on 
the two examples discussed in [15] and show that the feedback rates of the S-W region can also be obtained 
using Corollary 4.1. 

The Generalized Dueck Broadcast Channel 

This is a generalization of the Dueck example discussed in Section 1. The input $X$ is a binary triple 
$(X_0, X_1, X_2)$. The outputs of the two receivers are $Y = (X_0 + N_0, X_1 + N_1)$ and $Z = (X_0 + N_0, X_2 + N_2)$, 
where $(N_0, N_1, N_2)$ are binary random variables with distribution $P_{N_0, N_1, N_2}$ such that 

$$H(N_0, N_1) \le 1, \qquad H(N_0, N_2) \le 1.$$

We evaluate the rate region of Corollary 4.1 for noiseless feedback from receiver 1 with the following joint 
distribution. 

$$(W, U, V) \sim P_W P_U P_V \ \text{ with } \ P_W, P_U, P_V \sim \text{Bernoulli}(\tfrac{1}{2}),$$
$$Q_{C_0|\tilde C_0 \tilde W \tilde U \tilde V \tilde Y}: \ C_0 = \tilde Y \oplus \tilde U = \tilde N_1, \qquad (46)$$
$$X: \ (X_0, X_1, X_2) = (W, U, V).$$

With this choice of $Q$, $C_0$ is a Bernoulli random variable with the same distribution as $N_1$. With the joint 
distribution above, the mutual information quantities in Corollary 4.1 can be computed to be 

$$I(UW; Y|C_0) = 2 - H(N_0, N_1), \quad I(VW; Z|C_0) = 2 - H(N_0, N_2), \quad I(C_0; Y|\tilde Y \tilde C_0 \tilde W) = I(C_0; Z|\tilde Z \tilde C_0 \tilde W) = 0,$$
$$I(C_0; \tilde Y|\tilde C_0 \tilde W \tilde U) = H(N_1), \quad I(C_0; \tilde Z|\tilde C_0 \tilde W \tilde V) = H(N_1) - H(N_1|N_0, N_2), \quad I(C_0 \tilde Y; \tilde U|\tilde C_0 \tilde W) = 1,$$
$$I(C_0 \tilde Z; \tilde V|\tilde C_0 \tilde W) = 1 - H(N_2|N_0 N_1), \quad I(\tilde V \tilde S; C_0|\tilde C_0 \tilde W \tilde U) = I(\tilde U \tilde S; C_0|\tilde C_0 \tilde W \tilde V) = H(N_1),$$
$$I(C_0 W; Y|\tilde Y \tilde C_0 \tilde W \tilde U) = I(C_0 W; Z|\tilde Z \tilde C_0 \tilde W \tilde V) = 1 - H(N_0).$$

The rate region is given by 

$$R_0 < 1 - H(N_0) - H(N_1|N_0, N_2)$$
$$R_0 + R_1 < 2 - H(N_0, N_1)$$
$$R_0 + R_2 < 2 - H(N_0, N_1, N_2) \qquad (47)$$
$$R_0 + R_1 + R_2 < 3 - H(N_0, N_1, N_2)$$

The roles of $R_1, R_2$ in (47) can be exchanged by choosing $C_0 = \tilde Z \oplus \tilde V = \tilde N_2$. Thus the following feedback 
capacity region obtained in [15] is achievable. 

$$R_1 < 2 - H(N_0, N_1), \qquad R_2 < 2 - H(N_0, N_2), \qquad R_1 + R_2 < 3 - H(N_0, N_1, N_2).$$
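To make the bounds of this example concrete, the sketch below evaluates the right-hand sides of (47) for a given joint distribution of the binary noises (N0, N1, N2). It is only an illustration: the dictionary-based representation of the distribution and the function names are assumptions made here, not part of the paper.

```python
import math

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(pmf, keep):
    """Marginalize a pmf over tuples onto the coordinate positions in `keep`."""
    out = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def dueck_bounds(pmf):
    """Right-hand sides of (47); pmf maps (n0, n1, n2) to probabilities."""
    h0 = entropy(marginal(pmf, (0,)))
    h01 = entropy(marginal(pmf, (0, 1)))
    h02 = entropy(marginal(pmf, (0, 2)))
    h012 = entropy(pmf)
    return {
        "R0": 1 - h0 - (h012 - h02),   # H(N1 | N0, N2) = H(N0,N1,N2) - H(N0,N2)
        "R0+R1": 2 - h01,
        "R0+R2": 2 - h012,
        "R0+R1+R2": 3 - h012,
    }

if __name__ == "__main__":
    # Original Dueck setting as a special case: N0 = 0 and N1 = N2 ~ Bernoulli(1/2).
    print(dueck_bounds({(0, 0, 0): 0.5, (0, 1, 1): 0.5}))
```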
The Noisy Blackwell Broadcast Channel 

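This generalization of the Blackwell channel has ternary input alphabet $\mathcal{X} = \{0, 1, 2\}$, binary output 
alphabets $\mathcal{Y} = \mathcal{Z} = \{0, 1\}$ and channel law given by 

$$Y = \begin{cases} N, & X = 0 \\ 1 - N, & X = 1, 2 \end{cases} \qquad Z = \begin{cases} N, & X = 0, 1 \\ 1 - N, & X = 2 \end{cases}$$

where $N \sim \text{Bernoulli}(p)$ is a noise variable independent of $X$. With noiseless feedback from both receivers, 
the rate region obtained in [15] can also be obtained using Corollary 4.1 with the following joint distribution. 

$$P_W(0) = P_W(1) = \tfrac{1}{2},$$
$$P_{UV|W}(0, 0|W = 0) = \alpha, \quad P_{UV|W}(1, 1|W = 0) = \beta, \quad P_{UV|W}(1, 0|W = 0) = 1 - \alpha - \beta,$$
$$P_{UV|W}(0, 0|W = 1) = \beta, \quad P_{UV|W}(1, 1|W = 1) = \alpha, \quad P_{UV|W}(1, 0|W = 1) = 1 - \alpha - \beta, \qquad (48)$$
$$X = U + V,$$
$$Q_{C_0|\tilde C_0 \tilde W \tilde U \tilde V \tilde Y}: \ C_0 = \tilde Y \oplus \tilde U = \tilde Z \oplus \tilde V = \tilde N.$$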

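With $h(\cdot)$ denoting the binary entropy function, $x \star y = x(1 - y) + y(1 - x)$, $\bar\alpha = 1 - \alpha$, $\bar\beta = 1 - \beta$ and $\bar p = 1 - p$, the mutual information 
quantities in Corollary 4.1 are 

$$I(UW; Y|C_0) = I(VW; Z|C_0) = h\left(p \star \tfrac{\alpha + \beta}{2}\right) - h(p),$$
$$I(C_0; Y|\tilde Y \tilde C_0 \tilde W) = I(C_0; Z|\tilde Z \tilde C_0 \tilde W) = 0,$$
$$I(\tilde V \tilde S; C_0|\tilde C_0 \tilde W \tilde U) = I(\tilde U \tilde S; C_0|\tilde C_0 \tilde W \tilde V) = h(p),$$
$$I(C_0 \tilde Z; \tilde V|\tilde C_0 \tilde W) - I(U; V|W) = H(V|UW) = \tfrac{1}{2}\bar\alpha\, h\left(\tfrac{\beta}{\bar\alpha}\right) + \tfrac{1}{2}\bar\beta\, h\left(\tfrac{\alpha}{\bar\beta}\right),$$
$$I(C_0 \tilde Y; \tilde U|\tilde C_0 \tilde W) - I(U; V|W) = H(U|VW) = \tfrac{1}{2}\bar\alpha\, h\left(\tfrac{\beta}{\bar\alpha}\right) + \tfrac{1}{2}\bar\beta\, h\left(\tfrac{\alpha}{\bar\beta}\right),$$
$$I(W; Y|\tilde Y \tilde C_0 \tilde W \tilde U) = I(W; Z|\tilde Z \tilde C_0 \tilde W \tilde V) = h\left(p \star \tfrac{\alpha + \beta}{2}\right) - \tfrac{1}{2} h(\bar\alpha p + \alpha \bar p) - \tfrac{1}{2} h(\bar\beta p + \beta \bar p).$$

The rate region is then given by 

$$R_0 < h\left(p \star \tfrac{\alpha + \beta}{2}\right) - \tfrac{1}{2} h(\bar\alpha p + \alpha \bar p) - \tfrac{1}{2} h(\bar\beta p + \beta \bar p)$$
$$R_0 + R_1 < h\left(p \star \tfrac{\alpha + \beta}{2}\right) - h(p)$$
$$R_0 + R_2 < h\left(p \star \tfrac{\alpha + \beta}{2}\right) - h(p) \qquad (49)$$
$$R_0 + R_1 + R_2 < h\left(p \star \tfrac{\alpha + \beta}{2}\right) - h(p) + \tfrac{1}{2}\bar\alpha\, h\left(\tfrac{\beta}{\bar\alpha}\right) + \tfrac{1}{2}\bar\beta\, h\left(\tfrac{\alpha}{\bar\beta}\right)$$

For $R_0 = 0$, this matches the rate-region obtained in [15] for this channel. 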
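The bounds for the noisy Blackwell channel can be evaluated numerically. The sketch below is written under explicit assumptions: it takes the distribution (48) with P(U=0|W=0) = α and P(V=1|W=0) = β (roles swapped for W=1), and uses bounds of the form R0+R1, R0+R2 < h(p ⋆ (α+β)/2) - h(p), with the sum-rate bound adding H(V|UW) = ½(1-α)h(β/(1-α)) + ½(1-β)h(α/(1-β)). These expressions were derived here from Corollary 4.1 and should be read as a reconstruction, not as the paper's verbatim formulas; the function names are hypothetical.

```python
import math

def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def star(x, y):
    """Binary convolution x * y = x(1 - y) + y(1 - x)."""
    return x * (1.0 - y) + y * (1.0 - x)

def blackwell_bounds(p, alpha, beta):
    """Assumed right-hand sides of the region; requires 0 < alpha, beta < 1."""
    h_out = h(star(p, (alpha + beta) / 2.0))          # H(Y) = H(Z)
    h_cond = (0.5 * (1.0 - alpha) * h(beta / (1.0 - alpha))
              + 0.5 * (1.0 - beta) * h(alpha / (1.0 - beta)))  # H(V|UW)
    return {
        "R0": h_out - 0.5 * h(star(alpha, p)) - 0.5 * h(star(beta, p)),
        "R0+R1": h_out - h(p),
        "R0+R2": h_out - h(p),
        "R0+R1+R2": h_out - h(p) + h_cond,
    }

if __name__ == "__main__":
    print(blackwell_bounds(p=0.1, alpha=0.3, beta=0.3))
```

With p = 0 and α = β = 1/3 the sum-rate bound evaluates to log2(3), the sum capacity of the noiseless Blackwell channel, which is a useful consistency check on the assumed formulas.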



5 Proof of Theorem 1 



5.1 Preliminaries 



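We shall use the notion of typicality as defined in [17, 27]. Consider finite sets $\mathcal{Z}_1, \mathcal{Z}_2$ and any distribution 
$P_{Z_1 Z_2}$ on them. 

Definition 5.1. For any $\epsilon > 0$, the set of jointly $\epsilon$-typical sequences with respect to $P_{Z_1 Z_2}$ is defined as 

$$A_\epsilon^{(n)}(P_{Z_1 Z_2}) = \left\{ (\mathbf{z}_1, \mathbf{z}_2): \left| \frac{1}{n} N(a, b \,|\, \mathbf{z}_1, \mathbf{z}_2) - P_{Z_1 Z_2}(a, b) \right| \le \epsilon P_{Z_1 Z_2}(a, b), \text{ for all } (a, b) \in \mathcal{Z}_1 \times \mathcal{Z}_2 \right\},$$

where $N(a, b \,|\, \mathbf{z}_1, \mathbf{z}_2)$ is the number of occurrences of the symbol pair $(a, b)$ in the sequence pair $(\mathbf{z}_1, \mathbf{z}_2)$. 
For any $\mathbf{z}_1 \in \mathcal{Z}_1^n$, define the set of conditionally $\epsilon$-typical sequences as 

$$A_\epsilon^{(n)}(Z_2|\mathbf{z}_1) = \{ \mathbf{z}_2 : (\mathbf{z}_1, \mathbf{z}_2) \in A_\epsilon^{(n)}(P_{Z_1 Z_2}) \}.$$

The following are some basic properties of typical sequences that will be used in the proof. $\delta(\epsilon)$ will be 
used to denote a generic positive function of $\epsilon$ that tends to zero as $\epsilon \to 0$. 

Property 0: For all $\epsilon > 0$, and for all sufficiently large $n$, we have $P^n_{Z_1 Z_2}[A_\epsilon^{(n)}(P_{Z_1 Z_2})] > 1 - \epsilon$. 

Property 1: Let $\mathbf{z}_1 \in A_\epsilon^{(n)}(P_{Z_1})$ for some $\epsilon > 0$. If $\mathbf{Z}_2$ is generated according to the product distribution 
$\prod_{i=1}^{n} P_{Z_2|Z_1}(\cdot|z_{1i})$, then for all $\epsilon' > \epsilon$ 

$$\lim_{n \to \infty} \Pr[(\mathbf{z}_1, \mathbf{Z}_2) \in A_{\epsilon'}^{(n)}(P_{Z_1 Z_2})] = 1.$$

Property 2: For every $\mathbf{z}_1 \in \mathcal{Z}_1^n$, the size of the conditionally $\epsilon$-typical set is upper bounded as 

$$|A_\epsilon^{(n)}(Z_2|\mathbf{z}_1)| \le 2^{n(H(Z_2|Z_1) + \delta(\epsilon))}.$$

If $\mathbf{z}_1 \in A_\epsilon^{(n)}(P_{Z_1})$, then for any $\epsilon' > \epsilon$ and $n$ sufficiently large 

$$|A_{\epsilon'}^{(n)}(Z_2|\mathbf{z}_1)| \ge 2^{n(H(Z_2|Z_1) - \delta(\epsilon'))}.$$

Property 3: If $(\mathbf{z}_1, \mathbf{z}_2) \in A_\epsilon^{(n)}(P_{Z_1 Z_2})$, then 

$$2^{-n(H(Z_2|Z_1) + \delta(\epsilon))} \le P^n_{Z_2|Z_1}(\mathbf{z}_2|\mathbf{z}_1) \le 2^{-n(H(Z_2|Z_1) - \delta(\epsilon))}.$$

The definitions and properties above can be generalized in the natural way to tuples of multiple random 
variables as well. 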
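The typicality condition of Definition 5.1 is straightforward to test on finite sequences. The following is a minimal sketch, assuming the condition |N(a,b)/n - P(a,b)| ≤ εP(a,b) for every symbol pair; the function name and the dictionary representation of the distribution are choices made here for illustration.

```python
from collections import Counter

def is_jointly_typical(z1, z2, pmf, eps):
    """Check joint eps-typicality of the sequence pair (z1, z2).

    pmf maps symbol pairs (a, b) to probabilities; pairs that are absent
    from pmf are treated as having probability zero.
    """
    if len(z1) != len(z2) or len(z1) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(z1)
    counts = Counter(zip(z1, z2))
    # A pair of probability zero must not occur at all.
    if any(pmf.get(pair, 0.0) == 0.0 for pair in counts):
        return False
    # Every empirical frequency must lie within a relative eps of its probability.
    return all(abs(counts.get(pair, 0) / n - prob) <= eps * prob
               for pair, prob in pmf.items())

if __name__ == "__main__":
    uniform = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
    print(is_jointly_typical([0, 0, 1, 1], [0, 1, 0, 1], uniform, eps=0.1))
    print(is_jointly_typical([0, 0, 0, 0], [0, 0, 0, 0], uniform, eps=0.1))
```

Because the tolerance is relative to P(a, b), symbol pairs of zero probability can never appear in a typical pair, which is what the first check enforces.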

5.2 Random Codebook Generation 

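We recall that $K$ denotes the collection $(A, B, S)$, and $\mathcal{K}$ denotes the set $\mathcal{A} \times \mathcal{B} \times \mathcal{S}$. 

Fix a distribution $P_{UVABCXYZS}$ from $\mathcal{P}$ and a conditional distribution $Q_{ABC|\tilde U \tilde V \tilde K \tilde C}$ satisfying (6), as 
required by the statement of the theorem. Fix a positive integer $L$. There are $L$ blocks in encoding and 
decoding. Fix positive real numbers $R'_1, R'_2, R_0, R_1, R_2, \rho_0, \rho_1$ and $\rho_2$ such that $R'_1 > R_1$ and $R'_2 > R_2$, where 
these numbers denote the rates of codebooks to be constructed as described below. Fix block length $n$ and 
$\epsilon > 0$. Let $\epsilon_l$, $l = 1, \ldots, L$ be numbers such that $\epsilon < \epsilon_1 < \epsilon_2 < \cdots < \epsilon_L$. 
For $l = 1, 2, 3, \ldots, L$ independently perform the following random experiments. 

• For each sequence $\tilde{\mathbf{c}} \in \mathcal{C}^n$, generate $2^{n\rho_0}$ sequences $\mathbf{C}_{[l, i, \tilde{\mathbf{c}}]}$, $i = 1, 2, \ldots, 2^{n\rho_0}$, independently, where each 
sequence is generated from the product distribution $\prod_{i=1}^{n} P_{C|\tilde C}(\cdot|\tilde c_i)$. 

• For each sequence pair $(\mathbf{c}, \mathbf{a}) \in \mathcal{C}^n \times \mathcal{A}^n$, generate $2^{n(R'_1 - R_1)}$ sequences $\mathbf{U}_{[l, i, \mathbf{c}, \mathbf{a}]}$, $i = 1, 2, \ldots, 2^{n(R'_1 - R_1)}$, 
independently, where each sequence is generated from the product distribution $\prod_{i=1}^{n} P_{U|AC}(\cdot|a_i, c_i)$. Call 
this the first $U$-bin. Independently repeat this experiment $2^{nR_1}$ times to generate $2^{nR_1}$ $U$-bins, and a 
total of $2^{nR'_1}$ sequences. The $i$th sequence in the $j$th bin is $\mathbf{U}_{[l, (j-1)2^{n(R'_1 - R_1)} + i, \mathbf{c}, \mathbf{a}]}$. 

• For each sequence pair $(\mathbf{c}, \mathbf{b}) \in \mathcal{C}^n \times \mathcal{B}^n$, similarly generate $2^{nR_2}$ $V$-bins each containing $2^{n(R'_2 - R_2)}$ 
sequences, with each sequence being generated from the product distribution $\prod_{i=1}^{n} P_{V|BC}(\cdot|b_i, c_i)$. The 
$i$th sequence in the $j$th bin is $\mathbf{V}_{[l, (j-1)2^{n(R'_2 - R_2)} + i, \mathbf{c}, \mathbf{b}]}$. 

• For each $(\tilde{\mathbf{u}}, \tilde{\mathbf{c}}, \mathbf{c}) \in \mathcal{U}^n \times \mathcal{C}^n \times \mathcal{C}^n$, generate independently $2^{n\rho_1}$ sequences $\mathbf{A}_{[l, i, \tilde{\mathbf{u}}, \tilde{\mathbf{c}}, \mathbf{c}]}$, for $i = 1, 2, \ldots, 2^{n\rho_1}$, 
where each sequence is generated from $\prod_{j=1}^{n} P_{A|\tilde U \tilde C C}(\cdot|\tilde u_j, \tilde c_j, c_j)$. 

• For each $(\tilde{\mathbf{v}}, \tilde{\mathbf{c}}, \mathbf{c}) \in \mathcal{V}^n \times \mathcal{C}^n \times \mathcal{C}^n$, generate independently $2^{n\rho_2}$ sequences $\mathbf{B}_{[l, i, \tilde{\mathbf{v}}, \tilde{\mathbf{c}}, \mathbf{c}]}$, for $i = 1, 2, \ldots, 2^{n\rho_2}$, 
where each sequence is generated from $\prod_{j=1}^{n} P_{B|\tilde V \tilde C C}(\cdot|\tilde v_j, \tilde c_j, c_j)$. 

• For each $(\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{u}, \mathbf{v}) \in \mathcal{A}^n \times \mathcal{B}^n \times \mathcal{C}^n \times \mathcal{U}^n \times \mathcal{V}^n$, generate one sequence $\mathbf{X}_{[l, \mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{u}, \mathbf{v}]}$ using 
$\prod_{i=1}^{n} P_{X|ABCUV}(\cdot|a_i, b_i, c_i, u_i, v_i)$. 

• Generate independently sequences $\mathbf{U}[0], \mathbf{V}[0], \mathbf{C}[0], \mathbf{K}[0], \mathbf{X}[0], \mathbf{Y}[0], \mathbf{Z}[0]$ from the product distribution 
$P^n_{U, V, C, K, X, Y, Z}$. 

These sequences are known to all terminals before transmission begins. 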
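The binning of the U- and V-codebooks amounts to simple index arithmetic. The sketch below illustrates it under the assumption that each bin holds 2^{n(R' - R)} codewords, so that 2^{nR} bins give 2^{nR'} codewords in total, and that all indices are 1-based; the helper names are hypothetical.

```python
def bin_size(n, codebook_rate, message_rate):
    """Number of codewords per bin, 2^{n(R' - R)}, rounded to an integer exponent."""
    return 2 ** round(n * (codebook_rate - message_rate))

def codeword_index(bin_index, position, size):
    """Global index of the position-th codeword in bin number bin_index (1-based)."""
    return (bin_index - 1) * size + position

def bin_of(codeword, size):
    """Bin that contains the codeword with the given global index (1-based)."""
    return (codeword - 1) // size + 1

if __name__ == "__main__":
    # Toy parameters: n = 4, R'_1 = 1.0, R_1 = 0.5 gives 4 bins of 4 codewords each.
    size = bin_size(4, 1.0, 0.5)
    print(size, codeword_index(2, 1, size), bin_of(5, size))
```

A decoder that recovers the codeword index can read off the private message as the bin index, which is how the bin structure is used in the decoding operation.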
5.3 Encoding Operation 

Let $W_0[l]$ denote the common message, and $W_1[l]$, $W_2[l]$ the private messages for block $l$. These are independent 
random variables distributed uniformly over $\{0, 1, \ldots, 2^{nR_0} - 1\}$, $\{1, 2, \ldots, 2^{nR_1}\}$, and $\{1, 2, \ldots, 2^{nR_2}\}$, 
respectively. We set $W_0[0] = W_1[0] = W_2[0] = W_0[L] = W_1[L] = W_2[L] = 1$. 

For each block $l$, the encoder chooses a quintuple of sequences $(\mathbf{A}[l], \mathbf{B}[l], \mathbf{C}[l], \mathbf{U}[l], \mathbf{V}[l])$ from the five 
codebooks generated above, according to the encoding rule described below. The channel input and channel 
output sequences in block $l$ are denoted $\mathbf{X}[l]$, $\mathbf{Y}[l]$ and $\mathbf{Z}[l]$, respectively. 

Blocks $l = 1, 2, 3, \ldots, L$: The encoder performs the following sequence of operations. 

• Step 1: The encoder determines a triplet of indices $G_A[l] \in \{1, \ldots, 2^{n\rho_1}\}$, $G_B[l] \in \{1, \ldots, 2^{n\rho_2}\}$, and 
$G_C[l] \in \{1, \ldots, 2^{n\rho_0}\}$ such that 

1. $G_C[l] \bmod 2^{nR_0} = W_0[l]$,^2 and 

2. The tuple $(\mathbf{U}[l-1], \mathbf{V}[l-1], \mathbf{K}[l-1], \mathbf{C}[l-1])$ is jointly $\epsilon_l$-typical with the triplet of sequences 

$$\left( \mathbf{C}_{[l, G_C[l], \mathbf{C}[l-1]]},\ \mathbf{A}_{[l, G_A[l], \mathbf{U}[l-1], \mathbf{C}[l-1], \mathbf{C}_{[l, G_C[l], \mathbf{C}[l-1]]}]},\ \mathbf{B}_{[l, G_B[l], \mathbf{V}[l-1], \mathbf{C}[l-1], \mathbf{C}_{[l, G_C[l], \mathbf{C}[l-1]]}]} \right)$$

with respect to $P_{\tilde U \tilde V \tilde K \tilde C C A B}$.^3 
If no such index triplet is found, it declares error and sets $(G_A[l], G_B[l], G_C[l]) = (1, 1, 1)$. 
The encoder then sets 

$$\mathbf{C}[l] = \mathbf{C}_{[l, G_C[l], \mathbf{C}[l-1]]}, \quad \mathbf{A}[l] = \mathbf{A}_{[l, G_A[l], \mathbf{U}[l-1], \mathbf{C}[l-1], \mathbf{C}[l]]}, \quad \mathbf{B}[l] = \mathbf{B}_{[l, G_B[l], \mathbf{V}[l-1], \mathbf{C}[l-1], \mathbf{C}[l]]}.$$

• Step 2: The encoder chooses a pair of indices $(G_U[l], G_V[l])$ such that the tuple of sequences 

$$\left( \mathbf{U}_{[l, G_U[l], \mathbf{C}[l], \mathbf{A}[l]]},\ \mathbf{V}_{[l, G_V[l], \mathbf{C}[l], \mathbf{B}[l]]},\ \mathbf{A}[l],\ \mathbf{B}[l],\ \mathbf{C}[l] \right)$$

is $\epsilon$-typical with respect to $P_{UVABC}$, $\mathbf{U}_{[l, G_U[l], \mathbf{C}[l], \mathbf{A}[l]]}$ belongs to the $U$-bin with index $W_1[l]$, and 
$\mathbf{V}_{[l, G_V[l], \mathbf{C}[l], \mathbf{B}[l]]}$ belongs to the $V$-bin with index $W_2[l]$. If no such index pair is found, it declares error 
and sets $(G_U[l], G_V[l]) = (1, 1)$. 

The encoder then sets $\mathbf{U}[l] = \mathbf{U}_{[l, G_U[l], \mathbf{C}[l], \mathbf{A}[l]]}$, $\mathbf{V}[l] = \mathbf{V}_{[l, G_V[l], \mathbf{C}[l], \mathbf{B}[l]]}$, and $\mathbf{X}[l] = \mathbf{X}_{[l, \mathbf{A}[l], \mathbf{B}[l], \mathbf{C}[l], \mathbf{U}[l], \mathbf{V}[l]]}$. 
It transmits $\mathbf{X}[l]$ as the channel input sequence for block $l$. 

• Step 3: The broadcast channel produces $(\mathbf{Y}[l], \mathbf{Z}[l])$. 

• Step 4: After receiving $\mathbf{S}[l]$ via the feedback link, the encoder sets $\mathbf{K}[l] = (\mathbf{A}[l], \mathbf{B}[l], \mathbf{S}[l])$. 

^2 This condition corresponds to the role of $C$ in carrying the message $W_0[l]$ common to both receivers. 
^3 If there is more than one triplet satisfying the conditions, the encoder chooses one of them at random. 

5.4 Decoding Operation 

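Block 1: The objective at the end of this block is to decode the common message $W_0[1]$ at both receivers. 

• The first decoder receives $\mathbf{Y}[1]$, and the second decoder receives $\mathbf{Z}[1]$. 

• The first decoder determines the unique index pair $(\hat G_{C1}[1], \hat G_A[1])$ such that the tuples 

$$(\mathbf{C}[0], \mathbf{A}[0], \mathbf{U}[0], \mathbf{Y}[0]) \ \text{ and } \ \left( \hat{\mathbf{C}}_1[1],\ \mathbf{A}_{[1, \hat G_A[1], \mathbf{U}[0], \mathbf{C}[0], \hat{\mathbf{C}}_1[1]]},\ \mathbf{Y}[1] \right)$$

are jointly $\epsilon_1$-typical with respect to $P_{\tilde C \tilde A \tilde U \tilde Y C A Y}$, where $\hat{\mathbf{C}}_1[1] = \mathbf{C}_{[1, \hat G_{C1}[1], \mathbf{C}[0]]}$. Note that $\hat{\mathbf{C}}_1[1]$ is the 
estimate of $\mathbf{C}[1]$ at the first decoder. 

If not successful in this operation, the first decoder declares an error and sets $(\hat G_{C1}[1], \hat G_A[1]) = (1, 1)$, 
and $\hat{\mathbf{C}}_1[1] = \mathbf{C}_{[1, 1, \mathbf{C}[0]]}$. 

• The first decoder outputs $\hat W_0[1] = \hat G_{C1}[1] \bmod 2^{nR_0}$, and sets 

$$\hat{\mathbf{A}}[1] = \mathbf{A}_{[1, \hat G_A[1], \mathbf{U}[0], \mathbf{C}[0], \hat{\mathbf{C}}_1[1]]}.$$

$\hat{\mathbf{A}}[1]$ is the first decoder's estimate of $\mathbf{A}[1]$. 

• The second decoder determines the unique index pair $(\hat G_{C2}[1], \hat G_B[1])$ such that the tuples 

$$(\mathbf{C}[0], \mathbf{B}[0], \mathbf{V}[0], \mathbf{Z}[0]) \ \text{ and } \ \left( \hat{\mathbf{C}}_2[1],\ \mathbf{B}_{[1, \hat G_B[1], \mathbf{V}[0], \mathbf{C}[0], \hat{\mathbf{C}}_2[1]]},\ \mathbf{Z}[1] \right)$$

are jointly $\epsilon_1$-typical with respect to $P_{\tilde C \tilde B \tilde V \tilde Z C B Z}$, where $\hat{\mathbf{C}}_2[1] = \mathbf{C}_{[1, \hat G_{C2}[1], \mathbf{C}[0]]}$. Note that $\hat{\mathbf{C}}_2[1]$ is the 
estimate of $\mathbf{C}[1]$ at the second decoder. 

If not successful in this operation, the second decoder declares an error and sets $(\hat G_{C2}[1], \hat G_B[1]) = (1, 1)$, 
and $\hat{\mathbf{C}}_2[1] = \mathbf{C}_{[1, 1, \mathbf{C}[0]]}$. 

• The second decoder outputs $\hat W_0[1] = \hat G_{C2}[1] \bmod 2^{nR_0}$, and sets 

$$\hat{\mathbf{B}}[1] = \mathbf{B}_{[1, \hat G_B[1], \mathbf{V}[0], \mathbf{C}[0], \hat{\mathbf{C}}_2[1]]}.$$

$\hat{\mathbf{B}}[1]$ is the second decoder's estimate of $\mathbf{B}[1]$. 

Block $l$, $l = 2, 3, \ldots, L$: The objective at the end of block $l$ is for receiver 1 to decode $(W_0[l], W_1[l-1])$ and 
for receiver 2 to decode $(W_0[l], W_2[l-1])$. 

• The first decoder receives $\mathbf{Y}[l]$ and the second decoder receives $\mathbf{Z}[l]$. 

• The first decoder determines the unique index triplet $(\hat G_{C1}[l], \hat G_A[l], \hat G_U[l-1])$ such that the tuples 

$$(\hat{\mathbf{C}}_1[l-1], \hat{\mathbf{A}}[l-1], \hat{\mathbf{U}}[l-1], \mathbf{Y}[l-1]) \ \text{ and } \ \left( \hat{\mathbf{C}}_1[l],\ \mathbf{A}_{[l, \hat G_A[l], \hat{\mathbf{U}}[l-1], \hat{\mathbf{C}}_1[l-1], \hat{\mathbf{C}}_1[l]]},\ \mathbf{Y}[l] \right)$$

are jointly $\epsilon_l$-typical with respect to $P_{\tilde C \tilde A \tilde U \tilde Y C A Y}$, where 

$$\hat{\mathbf{U}}[l-1] = \mathbf{U}_{[l-1, \hat G_U[l-1], \hat{\mathbf{C}}_1[l-1], \hat{\mathbf{A}}[l-1]]}, \qquad \hat{\mathbf{C}}_1[l] = \mathbf{C}_{[l, \hat G_{C1}[l], \hat{\mathbf{C}}_1[l-1]]}.$$

If not successful in this operation, the first decoder declares an error and sets $(\hat G_{C1}[l], \hat G_A[l], \hat G_U[l-1]) = (1, 1, 1)$, and 

$$\hat{\mathbf{U}}[l-1] = \mathbf{U}_{[l-1, 1, \hat{\mathbf{C}}_1[l-1], \hat{\mathbf{A}}[l-1]]}, \qquad \hat{\mathbf{C}}_1[l] = \mathbf{C}_{[l, 1, \hat{\mathbf{C}}_1[l-1]]}.$$

Note that $\hat{\mathbf{U}}[l-1]$ and $\hat{\mathbf{C}}_1[l]$ are the estimates of $\mathbf{U}[l-1]$ and $\mathbf{C}[l]$, respectively, at the first decoder. 

• The first decoder then outputs $\hat W_0[l] = \hat G_{C1}[l] \bmod 2^{nR_0}$, and $\hat W_1[l-1]$ as the index of the $U$-bin that 
contains the sequence $\mathbf{U}_{[l-1, \hat G_U[l-1], \hat{\mathbf{C}}_1[l-1], \hat{\mathbf{A}}[l-1]]}$. The decoder sets 

$$\hat{\mathbf{A}}[l] = \mathbf{A}_{[l, \hat G_A[l], \hat{\mathbf{U}}[l-1], \hat{\mathbf{C}}_1[l-1], \hat{\mathbf{C}}_1[l]]}.$$

$\hat{\mathbf{A}}[l]$ is the first decoder's estimate of $\mathbf{A}[l]$. 

• The second decoder determines the unique index triplet $(\hat G_{C2}[l], \hat G_B[l], \hat G_V[l-1])$ such that the tuples 

$$(\hat{\mathbf{C}}_2[l-1], \hat{\mathbf{B}}[l-1], \hat{\mathbf{V}}[l-1], \mathbf{Z}[l-1]) \ \text{ and } \ \left( \hat{\mathbf{C}}_2[l],\ \mathbf{B}_{[l, \hat G_B[l], \hat{\mathbf{V}}[l-1], \hat{\mathbf{C}}_2[l-1], \hat{\mathbf{C}}_2[l]]},\ \mathbf{Z}[l] \right)$$

are jointly $\epsilon_l$-typical with respect to $P_{\tilde C \tilde B \tilde V \tilde Z C B Z}$, where 

$$\hat{\mathbf{V}}[l-1] = \mathbf{V}_{[l-1, \hat G_V[l-1], \hat{\mathbf{C}}_2[l-1], \hat{\mathbf{B}}[l-1]]}, \qquad \hat{\mathbf{C}}_2[l] = \mathbf{C}_{[l, \hat G_{C2}[l], \hat{\mathbf{C}}_2[l-1]]}.$$

If not successful in this operation, the second decoder declares an error and sets $(\hat G_{C2}[l], \hat G_B[l], \hat G_V[l-1]) = (1, 1, 1)$, and 

$$\hat{\mathbf{V}}[l-1] = \mathbf{V}_{[l-1, 1, \hat{\mathbf{C}}_2[l-1], \hat{\mathbf{B}}[l-1]]}, \qquad \hat{\mathbf{C}}_2[l] = \mathbf{C}_{[l, 1, \hat{\mathbf{C}}_2[l-1]]}.$$

Note that $\hat{\mathbf{V}}[l-1]$ and $\hat{\mathbf{C}}_2[l]$ are the estimates of $\mathbf{V}[l-1]$ and $\mathbf{C}[l]$, respectively, at the second decoder. 

• The second decoder then outputs $\hat W_0[l] = \hat G_{C2}[l] \bmod 2^{nR_0}$, and $\hat W_2[l-1]$ as the index of the $V$-bin that 
contains the sequence $\mathbf{V}_{[l-1, \hat G_V[l-1], \hat{\mathbf{C}}_2[l-1], \hat{\mathbf{B}}[l-1]]}$. The decoder sets 

$$\hat{\mathbf{B}}[l] = \mathbf{B}_{[l, \hat G_B[l], \hat{\mathbf{V}}[l-1], \hat{\mathbf{C}}_2[l-1], \hat{\mathbf{C}}_2[l]]}.$$

$\hat{\mathbf{B}}[l]$ is the second decoder's estimate of $\mathbf{B}[l]$. 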
5.5 Error Analysis 

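Let $\mathcal{E}[0]$ denote the event that $(\mathbf{U}[0], \mathbf{K}[0], \mathbf{V}[0], \mathbf{C}[0])$ is not $\epsilon$-typical with respect to $P_{UKVC}$. By Property 
0, we have $\Pr[\mathcal{E}[0]] < \epsilon$ for all sufficiently large $n$. 

Block 1: The error event in Block 1 can be expressed as $\mathcal{E}[1] = \mathcal{E}_1[1] \cup \mathcal{E}_2[1] \cup \mathcal{E}_3[1] \cup \mathcal{E}_4[1] \cup \mathcal{E}_5[1]$, where 

- $\mathcal{E}_1[1]$ is the event that the encoder declares error in step 1 of encoding (described in Section 5.3), 
- $\mathcal{E}_2[1]$ is the event that the encoder declares error in step 2 of encoding, 
- $\mathcal{E}_3[1]$ is the event that the tuples $(\mathbf{U}[0], \mathbf{V}[0], \mathbf{K}[0], \mathbf{C}[0])$ and $(\mathbf{U}[1], \mathbf{V}[1], \mathbf{K}[1], \mathbf{C}[1])$ are not jointly $\epsilon_1$-typical 
with respect to $P_{\tilde U \tilde V \tilde K \tilde C U V K C}$, 
- $\mathcal{E}_4[1]$ is the event that $(\hat G_{C1}[1], \hat G_A[1]) \ne (G_C[1], G_A[1])$, and $\mathcal{E}_5[1]$ is the event that $(\hat G_{C2}[1], \hat G_B[1]) \ne (G_C[1], G_B[1])$. 

Lemma 5.1 (Covering lemma). $\Pr[\mathcal{E}_1[1] \,|\, \mathcal{E}[0]^c] < \epsilon$ for all sufficiently large $n$ if $R_0, \rho_0, \rho_1$, and $\rho_2$ satisfy 

$$\rho_0 > I(\tilde U \tilde V \tilde K; C|\tilde C) + R_0 + \delta(\epsilon_1) \qquad (50)$$
$$\rho_0 + \rho_1 > I(\tilde V \tilde K; A|\tilde C C \tilde U) + I(\tilde U \tilde V \tilde K; C|\tilde C) + R_0 + \delta(\epsilon_1) \qquad (51)$$
$$\rho_0 + \rho_2 > I(\tilde U \tilde K; B|\tilde C C \tilde V) + I(\tilde U \tilde V \tilde K; C|\tilde C) + R_0 + \delta(\epsilon_1) \qquad (52)$$
$$\rho_0 + \rho_1 + \rho_2 > I(\tilde V \tilde K; A|\tilde C C \tilde U) + I(\tilde U \tilde K; B|\tilde C C \tilde V) + I(A; B|\tilde U \tilde V \tilde K \tilde C C) + I(\tilde U \tilde V \tilde K; C|\tilde C) + R_0 + \delta(\epsilon_1) \qquad (53)$$

Proof. The proof of this covering lemma is the same as that of [17, Lemma 14.1], with $(\tilde U, \tilde K)$ and $(\tilde V, \tilde K)$ 
assuming the roles of the two sources being covered. □ 

Lemma 5.2. $\Pr[\mathcal{E}_2[1] \,|\, \mathcal{E}[0]^c] < \epsilon$ for all sufficiently large $n$ if $R'_1, R'_2$, and $R_1, R_2$ satisfy 

$$R'_1 + R'_2 - R_1 - R_2 > H(U|AC) + H(V|BC) - H(UV|ABC) + \delta(\epsilon_1) \qquad (54)$$

Proof. This is very similar to a standard covering lemma used for bounding the probability of encoding error 
in Marton's coding scheme, a proof of which can be found in [2], [16] or [17]. □ 

From Property 1 of typical sequences, it follows that $\Pr[\mathcal{E}_3[1] \,|\, \mathcal{E}_1[1]^c, \mathcal{E}_2[1]^c, \mathcal{E}[0]^c] < \epsilon$ for all sufficiently 
large $n$. 

Lemma 5.3. $\Pr[\mathcal{E}_4[1] \cup \mathcal{E}_5[1] \,|\, \mathcal{E}_3[1]^c, \mathcal{E}_2[1]^c, \mathcal{E}_1[1]^c, \mathcal{E}[0]^c] < 2\epsilon$, if 

$$\rho_0 + \rho_1 < I(C; Y \tilde A \tilde Y \tilde U|\tilde C) + I(A; Y \tilde A \tilde Y|\tilde U \tilde C C) - 3\delta(\epsilon_1) \qquad (55)$$
$$\rho_0 + \rho_2 < I(C; Z \tilde B \tilde Z \tilde V|\tilde C) + I(B; Z \tilde B \tilde Z|\tilde V \tilde C C) - 3\delta(\epsilon_1) \qquad (56)$$
$$\rho_1 < I(A; Y \tilde A \tilde Y|\tilde U \tilde C C) - 3\delta(\epsilon_1) \qquad (57)$$
$$\rho_2 < I(B; Z \tilde B \tilde Z|\tilde V \tilde C C) - 3\delta(\epsilon_1) \qquad (58)$$

Proof. The proof is a special case of that of Lemma 5.4 given below. □ 

Hence $\Pr[\mathcal{E}[1] \,|\, \mathcal{E}[0]^c] < 5\epsilon$ if the conditions given in Lemmas 5.1, 5.2 and 5.3 are satisfied. This implies 
that $\hat{\mathbf{A}}[1] = \mathbf{A}[1]$, $\hat{\mathbf{C}}_1[1] = \hat{\mathbf{C}}_2[1] = \mathbf{C}[1]$, and similarly $\hat{\mathbf{B}}[1] = \mathbf{B}[1]$ with high probability. 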

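Block $l$, $l = 2, 3, \ldots, L$: The error event in block $l$ can be expressed as $\mathcal{E}[l] = \cup_{i=1}^{5} \mathcal{E}_i[l]$, where 

- $\mathcal{E}_1[l]$ is the event that the encoder declares error in step 1 of encoding, 

- $\mathcal{E}_2[l]$ is the event that the encoder declares error in step 2 of encoding, 

- $\mathcal{E}_3[l]$ is the event that the tuples $(\mathbf{U}[l-1], \mathbf{V}[l-1], \mathbf{K}[l-1], \mathbf{C}[l-1])$ and $(\mathbf{U}[l], \mathbf{V}[l], \mathbf{K}[l], \mathbf{C}[l])$ are not 
jointly $\epsilon_l$-typical with respect to $P_{\tilde U \tilde V \tilde K \tilde C U V K C}$, 

- $\mathcal{E}_4[l]$ is the event that $(\hat G_{C1}[l], \hat G_A[l], \hat G_U[l-1]) \ne (G_C[l], G_A[l], G_U[l-1])$, and $\mathcal{E}_5[l]$ is the event that 
$(\hat G_{C2}[l], \hat G_B[l], \hat G_V[l-1]) \ne (G_C[l], G_B[l], G_V[l-1])$. 

Using arguments similar to those used for Block 1, one can show that if $\rho_0, \rho_1, \rho_2, R'_1, R'_2, R_1$ and $R_2$ satisfy 
the conditions given in (54) and (50)-(53) with $\epsilon_1$ replaced with $\epsilon_l$, then for all sufficiently large $n$, 

$$\Pr\left[ \mathcal{E}_1[l] \cup \mathcal{E}_2[l] \cup \mathcal{E}_3[l] \,\Big|\, \cap_{k=0}^{l-1} \mathcal{E}[k]^c \right] < 3\epsilon.$$

Lemma 5.4 (Packing lemma). $\Pr\left[ \mathcal{E}_4[l] \cup \mathcal{E}_5[l] \,\Big|\, \mathcal{E}_3[l]^c, \mathcal{E}_2[l]^c, \mathcal{E}_1[l]^c, \cap_{k=0}^{l-1} \mathcal{E}[k]^c \right] < 2\epsilon$, if 

$$R'_1 + \rho_0 + \rho_1 < I(\tilde U; Y \tilde Y \tilde A|\tilde C) + I(C; Y \tilde A \tilde Y \tilde U|\tilde C) + I(A; Y \tilde A \tilde Y|\tilde U \tilde C C) - 3\delta(\epsilon_l) \qquad (59)$$
$$R'_1 + \rho_1 < I(\tilde U; Y \tilde A \tilde Y C|\tilde C) + I(A; Y \tilde A \tilde Y|\tilde U \tilde C C) - 3\delta(\epsilon_l) \qquad (60)$$
$$R'_2 + \rho_0 + \rho_2 < I(\tilde V; Z \tilde Z \tilde B|\tilde C) + I(C; Z \tilde B \tilde Z \tilde V|\tilde C) + I(B; Z \tilde B \tilde Z|\tilde V \tilde C C) - 3\delta(\epsilon_l) \qquad (61)$$
$$R'_2 + \rho_2 < I(\tilde V; Z \tilde B \tilde Z C|\tilde C) + I(B; Z \tilde B \tilde Z|\tilde V \tilde C C) - 3\delta(\epsilon_l) \qquad (62)$$
$$\rho_0 + \rho_1 < I(C; Y \tilde A \tilde Y \tilde U|\tilde C) + I(A; Y \tilde A \tilde Y|\tilde U \tilde C C) - 3\delta(\epsilon_l) \qquad (63)$$
$$\rho_0 + \rho_2 < I(C; Z \tilde B \tilde Z \tilde V|\tilde C) + I(B; Z \tilde B \tilde Z|\tilde V \tilde C C) - 3\delta(\epsilon_l) \qquad (64)$$
$$\rho_1 < I(A; Y \tilde A \tilde Y|\tilde U \tilde C C) - 3\delta(\epsilon_l) \qquad (65)$$
$$\rho_2 < I(B; Z \tilde B \tilde Z|\tilde V \tilde C C) - 3\delta(\epsilon_l) \qquad (66)$$

Proof. See Appendix C. □ 

Hence $\Pr\left[ \mathcal{E}[l] \,\big|\, \cap_{k=0}^{l-1} \mathcal{E}[k]^c \right] < 5\epsilon$. Under the event $\cap_{k=0}^{l} \mathcal{E}[k]^c$, we have $\hat{\mathbf{A}}[l] = \mathbf{A}[l]$, $\hat{\mathbf{C}}_1[l] = \hat{\mathbf{C}}_2[l] = \mathbf{C}[l]$, and 
$\hat{\mathbf{B}}[l] = \mathbf{B}[l]$. 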

Overall Probability of Decoding Error : Hence the probability of decoding error over L blocks 
satisfies 

Pr[f] = Pr [uf^o'^W] < 5ei 



if the conditions given in (54), (50)-(53) and (59)-(66) are satisfied with $\delta(\epsilon_1)$ and $\delta(\epsilon_l)$ replaced with 
$\theta$, where $\theta = \sum_{l=1}^{L} \delta(\epsilon_l)$. This implies that the rate region given by (14), (15)-(18), (19)-(26) is achievable. 
By applying Fourier-Motzkin elimination to these inequalities, we obtain that the rate region given in the 
statement of the theorem is achievable. The details of this elimination are omitted since they are elementary, 
but somewhat tedious. 
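For readers unfamiliar with the elimination step, the following is a minimal sketch of one round of Fourier-Motzkin elimination on a system of linear inequalities of the form a·x ≤ b. It is a generic illustration on a toy two-variable system, not the elimination actually carried out for the theorem; the function name and the data layout are assumptions made here. The same procedure applies to strict inequalities.

```python
from fractions import Fraction

def eliminate(inequalities, k):
    """Eliminate variable k from a list of (coeffs, bound) pairs.

    Each pair means sum(coeffs[i] * x[i]) <= bound. The result is a list of
    inequalities over the remaining variables, in the same format.
    """
    upper, lower, rest = [], [], []
    for coeffs, bound in inequalities:
        if coeffs[k] > 0:
            upper.append((coeffs, bound))   # upper bound on x[k]
        elif coeffs[k] < 0:
            lower.append((coeffs, bound))   # lower bound on x[k]
        else:
            rest.append((coeffs, bound))

    def drop(coeffs):
        return list(coeffs[:k]) + list(coeffs[k + 1:])

    result = [(drop(c), b) for c, b in rest]
    for cu, bu in upper:
        for cl, bl in lower:
            # Scale so that x[k] has coefficient +1 and -1, then add the two rows.
            su = Fraction(1) / cu[k]
            sl = Fraction(1) / (-cl[k])
            combined = [su * a + sl * b for a, b in zip(cu, cl)]
            result.append((drop(combined), su * bu + sl * bl))
    return result

# Toy system in (R, rho): a packing-type bound R + rho <= 1 and a
# covering-type bound rho >= 3/10, written as -rho <= -3/10.
system = [([1, 1], Fraction(1)), ([0, -1], Fraction(-3, 10))]
projected = eliminate(system, 1)
print(projected)  # a single inequality R <= 7/10
```

In the proof, the auxiliary rates (ρ0, ρ1, ρ2, R'1, R'2) play the role of the eliminated variables, leaving constraints on (R0, R1, R2) only.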



6 Conclusion 

We have derived a single-letter rate region for the two-user broadcast channel with feedback. Using the Marton 
coding scheme as the starting point, our scheme has a block-Markov structure and uses three additional random 
variables {A, B, C) to cover the correlated information generated at the end of each block. The proposed region 
was used to compute achievable rates for the AWGN channel with noisy feedback. In particular, it was shown 
that sum rates higher than the no-feedback sum capacity could be achieved even with noisy feedback from only 
one receiver. 

The key to obtaining a single-letter characterization was to impose a constraint on the Markov kernel 
connecting the distribution of the random variables across successive blocks. A similar idea was used in [19] 
for multiple-access channels with feedback. This approach to harnessing correlated information is quite general, 
and it is likely that it can be used to obtain improved rate regions for other multi-user channels with feedback 
such as interference and relay channels. 

APPENDIX 



A Mutual Information terms for the AWGN example 



With the joint distribution described in Section 4.2, we first compute the following quantities. 



EiAV] = ^P^^Pf-fP^) (68) 

^ — i^T^^ — + KT^^ '^^^^ 



Pi + a^ 



+ (T 



Pia^{a + al3) ( (j) 



E[/^Y] = ' ' r"^' , ' a . (71) 

We next compute the conditional variances in terms of which the mutual information quantities are expressed. 

v.r(f,|C„^).l-<|M!)! 

M^aPi 

y^r{n\CoU) = 1 - }^p^^T2-p. (73) 
Mu[aPi + p'^aPi) 

varm|CoZ)^l~ j^tff' (74) 



(g[Ay])^ 
i\4(Fi+a2 



var(f 1 ICoF) - 1 - /„ (75) 



var Ti Co^Z = 1 - — _„ , „ ^ — (76) 

Mu \ aPi{aPi +0-2) J 

var(Ti Cot/y = 1 - — , 2\t p , r2-p\ 1 p ^ r- p \2 (^7) 

(78) 



where 

ai = E[/W]{Pi+(t'^) - E[/^Z]aPi, hi ^ E[l^Z]aPi - E[lW]aPi, 
a2 = E[Aiy]{Pi + CT^) _ p[Ay](aPi + /3aPi), 62 = E[/^Y]{aPi + ^^aPi) - S[AC7](aPi + l3aPi] 

Finally, the mutual information terms are calculated to be 



(79) 
2 \ a 



/(Co; US\CoV) = ^ log2 (^1 + il^var(fi|C'ol/)) , /(Co; f ^jCof/) = ^ logs (l + ^^^var(fi|Co;7)) , 

nCSlCoU) 4 log, f a-^WT.|Coj7) + /^\ ^ 1 f il-D)..rif,\C,VHD\ 

^ ' ^ 2 \^(l-/?)var(ri|CoC/r)+/?; V u, I ; 2 \^(l-7^)var(Ti|CoFZ) + /?y 

7,C.; f>|C.f , ^ 1 log, ( . /(Co; . 1 log. ( " " '')™'<f.l<5oZ) + D \ 



2 \^(l-/?)var(fi|CoC^l^)+/?y/ ' v 1 " ^ 2 - 7?)var(fi|CoFZ) + 

7(C.; y |C.f) 4 log, (1 + (f-fl)((l-^^)varm|Coy) + Z.) j ^ 

KCo; . 1 log, (^1 + (^-^.)((1-^^)WT.|C„Z) + I» j _ 
IiC„-,znvZ) = i log, (1 + (^-^.Xd-gvarmlftV-Zi + Z^l j 

B Comparison with Shayevitz-Wigger Region 

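In Theorem 1 set $A = V_1$, $B = V_2$, $C = (V_0, W)$ and consider joint distributions over two blocks of the form 

$$P_{\tilde U \tilde V \tilde W \tilde X \tilde Y \tilde Z \tilde S}\; Q_{V_0 V_1 V_2|\tilde U \tilde V \tilde W \tilde S}\; P_{WUV}\, P_{X|WUV}\, P_{YZS|X}. \qquad (80)$$

If we set $Q_{V_0 V_1 V_2|\tilde U \tilde V \tilde W \tilde S} = P_{V_0|\tilde U \tilde V \tilde W \tilde S}\, P_{V_1|V_0 \tilde U \tilde V \tilde W \tilde S}\, P_{V_2|V_0 \tilde U \tilde V \tilde W \tilde S}$, the joint distribution is identical to that of the 
Shayevitz-Wigger region. With this distribution, Theorem 1 yields the following. (The parts in bold indicate 
the corresponding constraints of the Shayevitz-Wigger region.) 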

i?o < {Ti,T2} 

Ro + Ri< I(UW;YVi)-I(UVWS;VoVi|Y)-H/(Vo;C/W^|Fiy) 
Ro + R2< I(VW;ZV2) -I(UVWS;VoV2|Y) -h/(Fo;T^M^|V2Z) 
i?o + ^1 + /?2 < I(UW; YVi) + I(V; ZV2IW) - I(U; V|W) - I(UVWS; VoVi|Y) - I(UVWS; VajVoZ) 

+ /(Vb; UW\ViY) + /(Vb; ^1^2!^^) + /(^2; W\ZVo) 
Ra + Ri+R2 < I(VW;ZV2) +I(U; YVi|W) -I(U;V|W) -I(UVWS;VoV2|Z) -I(UVWS; Vi|VoY) 
+ I{Vo; VW\V2Z) + /(Vb; U\ViWY) + /(Fi; W\YVo) 
2i?o + /?! + /?2 < I(UW; YVi) + I(VW; ZV2) - I(U; V|W) - I(UVWS; VoVi|Y) - I(UVWS; V0V2IZ) 
-f /(Fo; UW\ViY) + /(Vb; VW\V2Z) 

(81) 
where 

Ti = I(W;Y) + I{ViVo;Y\WU) ~ I{VS;ViVo\WU) + I(V2; Z\WVVo) ~ I{US;V2\WVVo) 

(82) 

T2 = I{W: Z) + 7(^2 Vb; Z\WV) - I{US; V^VqWV) + Iiyv,Y\WUVa) - I{VS; Vi\WUVo) 



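C Proof of Lemma 5.4 

We show through induction that if $\Pr[\mathcal{E}_4[k]] < \epsilon$ for $k = 1, \ldots, l-1$, then $\Pr(\mathcal{E}_4[l]) < \epsilon$ if the conditions in the 
statement of the lemma are satisfied. 

For conciseness, let $\mathcal{F}$ denote the event $\left( \cap_{k=0}^{l-1} \mathcal{E}[k]^c \cap \mathcal{E}_1[l]^c \cap \mathcal{E}_2[l]^c \cap \mathcal{E}_3[l]^c \right)$. Note that $\mathcal{F}$ is the conditioning 
event in the statement of Lemma 5.4, hence $\hat{\mathbf{C}}_1[l-1] = \hat{\mathbf{C}}_2[l-1] = \mathbf{C}[l-1]$, $\hat{\mathbf{A}}[l-1] = \mathbf{A}[l-1]$ and 
$\hat{\mathbf{B}}[l-1] = \mathbf{B}[l-1]$. Recall that given $\mathbf{C}[l-1]$ and the indices $G_C[l], G_A[l], G_U[l-1]$, the following sequences 
are determined: 

$$\mathbf{U}[l-1] = \mathbf{U}_{[l-1, G_U[l-1], \mathbf{C}[l-1], \mathbf{A}[l-1]]},$$
$$\mathbf{C}[l] = \mathbf{C}_{[l, G_C[l], \mathbf{C}[l-1]]}, \qquad (83)$$
$$\mathbf{A}[l] = \mathbf{A}_{[l, G_A[l], \mathbf{U}[l-1], \mathbf{C}[l-1], \mathbf{C}[l]]}.$$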

Define the following indicator random variable: ipihj, k) = 1 ii the tuples 

(U[i_i,fc,c[(-i], A[/-i]], A[/-1],Y[?-1],C[Z-1]) and (C[,^,^c[;-i]], Y[/], A[;j_U[i_i,,,c[i-i],A[,-i]],c[;-i],C[i..,c[,-i 
are jointly ertypical with respect to Pqayccay ^^'^ ^ otherwise. We have 



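$$\Pr(\mathcal{E}_4[l] \mid \mathcal{F}) = \Pr\big(\exists\,(i,j,k) \neq (G_C[l], G_A[l], G_U[l-1]) \text{ s.t. } \psi(i,j,k) = 1 \,\big|\, \mathcal{F}\big) \le \Phi_1 + \Phi_2 + \Phi_3 + \Phi_4, \tag{84}$$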



where 



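$$\Phi_1 = \Pr\big(\exists\, j \neq G_A[l] \text{ s.t. } \psi(G_C[l], j, G_U[l-1]) = 1 \,\big|\, \mathcal{F}\big), \tag{85}$$

$$\Phi_2 = \Pr\big(\exists\, i \neq G_C[l],\, j \text{ s.t. } \psi(i, j, G_U[l-1]) = 1 \,\big|\, \mathcal{F}\big), \tag{86}$$

$$\Phi_3 = \Pr\big(\exists\, k \neq G_U[l-1],\, j \text{ s.t. } \psi(G_C[l], j, k) = 1 \,\big|\, \mathcal{F}\big), \tag{87}$$

$$\Phi_4 = \Pr\big(\exists\, i \neq G_C[l],\, k \neq G_U[l-1],\, j \text{ s.t. } \psi(i, j, k) = 1 \,\big|\, \mathcal{F}\big). \tag{88}$$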
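
The four terms can be read as a partition of the incorrect index triples according to which of $i$ and $k$ are correct. The display below is a sketch of that bookkeeping in the notation above, with the block arguments of $G_C[l]$, $G_A[l]$, $G_U[l-1]$ dropped for brevity:

$$\begin{aligned}
\{(i,j,k) \neq (G_C, G_A, G_U)\} ={}& \{i = G_C,\ k = G_U,\ j \neq G_A\} \,\cup\, \{i \neq G_C,\ k = G_U\} \\
&\cup\, \{i = G_C,\ k \neq G_U\} \,\cup\, \{i \neq G_C,\ k \neq G_U\}.
\end{aligned}$$

The index $j$ is unrestricted in the last three sets. Applying the union bound to the four sets gives one term for each of $\Phi_1$, $\Phi_2$, $\Phi_3$ and $\Phi_4$.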



C.1 Upper bound for $\Phi_1$

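Using the union bound, we have

$$\Phi_1 \le \sum_{j=1}^{2^{n\rho_1}} \Pr\Big[\big(\mathbf{U}[l-1], \mathbf{A}[l-1], \mathbf{Y}[l-1], \mathbf{C}[l-1], \mathbf{C}[l], \mathbf{Y}[l], \mathbf{A}_{[l,j,\mathbf{U}[l-1],\mathbf{C}[l-1],\mathbf{C}[l]]}\big) \in \mathcal{A}_\epsilon^{(n)}(P_{\tilde C\tilde A\tilde U\tilde YCAY}),\ G_A[l] \neq j \,\Big|\, \mathcal{F}\Big]. \tag{89}$$

For brevity, we denote the tuple $(\mathbf{U}[l-1], \mathbf{C}[l-1], \mathbf{C}[l])$ by $\mathbf{T}'$ and the tuple $(\mathbf{U}[l-1], \mathbf{C}[l-1], \mathbf{C}[l], \mathbf{A}[l-1], \mathbf{Y}[l-1])$ by $\mathbf{T}$. (89) can then be written as

$$\begin{aligned}
\Phi_1 &\le \frac{1}{\Pr[\mathcal{F}]} \sum_{j=1}^{2^{n\rho_1}} \sum_{\mathbf{t},\mathbf{a},\mathbf{y} \in \mathcal{A}_\epsilon^{(n)}} \Pr\big[\mathbf{T} = \mathbf{t},\ \mathbf{A}_{[l,j,\mathbf{T}']} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y},\ G_A[l] \neq j\big] \\
&= \frac{2^{n\rho_1}}{\Pr[\mathcal{F}]} \sum_{\mathbf{t},\mathbf{a},\mathbf{y} \in \mathcal{A}_\epsilon^{(n)}} \Pr[\mathbf{T} = \mathbf{t}]\, \Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y},\ G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t}\big],
\end{aligned} \tag{90}$$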

where the equality is due to the symmetry of the codebook construction. We note that the index $G_A[l]$ is a function of the entire $A$-codebook $\{\mathbf{A}_{[l,j,\mathbf{U}[l-1],\mathbf{C}[l-1],\mathbf{C}[l]]},\ j = 1, \ldots, 2^{n\rho_1}\}$, and so the events

$$G_A[l] \neq 1 \quad\text{and}\quad \big(\mathbf{A}_{[l,1,\mathbf{U}[l-1],\mathbf{C}[l-1],\mathbf{C}[l]]} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y}\big)$$

are dependent. This dependency can be handled using the technique developed in [28].



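Let $\mathcal{C}$ be the set $\{\mathbf{A}_{[l,j,\mathbf{U}[l-1],\mathbf{C}[l-1],\mathbf{C}[l]]},\ j = 2, \ldots, 2^{n\rho_1}\}$, i.e., $\mathcal{C}$ is the $A$-codebook without the first codeword. Focusing on the inner term of the summation in (90), we have

$$\begin{aligned}
&\Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y},\ G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t}\big] \\
&\quad\le \Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y} \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad= \sum_{c} \Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y} \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] \cdot \Pr\big[\mathcal{C} = c \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(a)}{=} \sum_{c} \Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] \cdot \Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] \cdot \Pr\big[\mathcal{C} = c \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(b)}{\le} 2 \Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \,\big|\, \mathbf{T}' = \mathbf{t}'\big] \sum_{c} \Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]\, \Pr\big[\mathcal{C} = c \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(c)}{\le} 4 \Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \,\big|\, \mathbf{T}' = \mathbf{t}'\big]\, \Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(d)}{\le} 4 \cdot 2^{-n(H(A|C\tilde C\tilde U) - \delta(\epsilon_1))} \cdot 2^{-n(H(Y|C\tilde C\tilde U\tilde A\tilde Y) - \delta(\epsilon_1))}.
\end{aligned} \tag{91}$$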

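In the chain above, (a) is true because given $G_A[l] \neq 1$, the following Markov chain holds:

$$\mathbf{A}_{[l,1,\mathbf{T}']} - (\mathcal{C}, \mathbf{T}) - \mathbf{X}[l] - \mathbf{Y}[l].$$

(d) follows from Property 3 of typical sequences, while (b) and (c) are obtained from the following claim, proved along the lines of [28, Lemmas 1 and 2].

Claim 1.
1. $\Pr[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \mid G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c] \le 2 \Pr[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \mid \mathbf{T}' = \mathbf{t}']$.
2. $\Pr[\mathbf{Y}[l] = \mathbf{y} \mid G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t}] \le 2 \Pr[\mathbf{Y}[l] = \mathbf{y} \mid \mathbf{T} = \mathbf{t}]$.

Proof. We have

$$\begin{aligned}
\Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]
&= \frac{\Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a},\ G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]}{\Pr\big[G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]} \\
&\le \frac{\Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \,\big|\, \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]}{\Pr\big[G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]} \\
&= \frac{\Pr\big[\mathbf{A}_{[l,1,\mathbf{T}']} = \mathbf{a} \,\big|\, \mathbf{T}' = \mathbf{t}'\big]}{\Pr\big[G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]},
\end{aligned} \tag{92}$$

where the last equality holds because each codeword of the codebook $\{\mathbf{A}_{[l,j,\mathbf{T}']},\ j = 1, \ldots, 2^{n\rho_1}\}$ is independently generated, conditioned only on the symbols of $\mathbf{T}'$. We now provide a lower bound for the denominator of (92).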

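$$\begin{aligned}
\Pr\big[G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] &= 1 - \Pr\big[G_A[l] = 1 \,\big|\, \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] \\
&\ge 1 - \Pr\big[(\mathbf{A}_{[l,1,\mathbf{T}']}, \mathbf{T}) \in \mathcal{A}_\epsilon^{(n)}(P_{\tilde C\tilde A\tilde U\tilde YCA}) \,\big|\, \mathbf{T} = \mathbf{t}\big] \\
&\ge \frac{1}{2}
\end{aligned} \tag{93}$$

for sufficiently large $n$. Substituting in (92) completes the proof of the first part of the claim.
For the second part, we write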



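$$\begin{aligned}
\Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, G_A[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big]
&= \frac{\Pr\big[\mathbf{Y}[l] = \mathbf{y},\ G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t}\big]}{\Pr\big[G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t}\big]} \\
&\le \frac{\Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, \mathbf{T} = \mathbf{t}\big]}{\Pr\big[G_A[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t}\big]} \\
&= \frac{\Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, \mathbf{T} = \mathbf{t}\big]}{(2^{n\rho_1} - 1)/2^{n\rho_1}} \\
&\le 2 \Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, \mathbf{T} = \mathbf{t}\big]
\end{aligned} \tag{94}$$

for large enough $n$. The second equality in the chain above is due to the symmetry of the codebook construction. This completes the proof of the claim. □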
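
The constant 2 in the second part of the claim can be checked directly. The following is a short sketch, assuming only that $n\rho_1 \ge 1$, so that the $A$-codebook contains at least two codewords:

$$\frac{2^{n\rho_1} - 1}{2^{n\rho_1}} = 1 - 2^{-n\rho_1} \ge \frac{1}{2}
\quad\Longrightarrow\quad
\frac{\Pr[\mathbf{Y}[l] = \mathbf{y} \mid \mathbf{T} = \mathbf{t}]}{(2^{n\rho_1} - 1)/2^{n\rho_1}} \le 2 \Pr[\mathbf{Y}[l] = \mathbf{y} \mid \mathbf{T} = \mathbf{t}].$$

The first part works the same way: once the denominator of (92) is at least $1/2$, the ratio is at most twice its numerator.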



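Substituting the bound from (91) in (90), we obtain

$$\begin{aligned}
\Phi_1 &\le \frac{2^{n\rho_1}}{\Pr[\mathcal{F}]} \sum_{\mathbf{t}} \Pr[\mathbf{T} = \mathbf{t}] \sum_{\mathbf{a},\mathbf{y} \in \mathcal{A}_\epsilon^{(n)}(\cdot|\mathbf{t})} 4 \cdot 2^{2n\delta(\epsilon_1)} \cdot 2^{-nH(A|C\tilde C\tilde U)} \cdot 2^{-nH(Y|C\tilde C\tilde U\tilde A\tilde Y)} \\
&\overset{(a)}{\le} \frac{2^{n\rho_1}}{\Pr[\mathcal{F}]} \sum_{\mathbf{t}} \Pr[\mathbf{T} = \mathbf{t}] \Big( 2^{n(H(AY|C\tilde C\tilde U\tilde A\tilde Y) + \delta(\epsilon_1))} \cdot 4 \cdot 2^{2n\delta(\epsilon_1)} \cdot 2^{-nH(A|C\tilde C\tilde U)} \cdot 2^{-nH(Y|C\tilde C\tilde U\tilde A\tilde Y)} \Big) \\
&\le \frac{4 \cdot 2^{3n\delta(\epsilon_1)} \cdot 2^{n\rho_1} \cdot 2^{nH(AY|C\tilde C\tilde U\tilde A\tilde Y)}}{\Pr[\mathcal{F}] \cdot 2^{nH(A|C\tilde C\tilde U)} \cdot 2^{nH(Y|C\tilde C\tilde U\tilde A\tilde Y)}},
\end{aligned} \tag{95}$$

where (a) follows from the upper bound on the size of the conditionally typical set (Property 2).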
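
To see what (95) asks of $\rho_1$, the exponents can be collected with the chain rule. The following is a sketch in the notation above, where a tilde is assumed to mark a previous-block variable; the resulting threshold is presumably the relevant condition of Lemma 5.4.

$$\begin{aligned}
H(AY|C\tilde C\tilde U\tilde A\tilde Y) &= H(Y|C\tilde C\tilde U\tilde A\tilde Y) + H(A|Y C\tilde C\tilde U\tilde A\tilde Y), \\
H(A|C\tilde C\tilde U) - H(A|Y C\tilde C\tilde U\tilde A\tilde Y) &= I(A; Y\tilde A\tilde Y \mid C\tilde C\tilde U), \\
\Phi_1 &\le \frac{4}{\Pr[\mathcal{F}]} \cdot 2^{-n\left(I(A; Y\tilde A\tilde Y | C\tilde C\tilde U) - \rho_1 - 3\delta(\epsilon_1)\right)}.
\end{aligned}$$

Under this reading, the bound on $\Phi_1$ vanishes as $n \to \infty$ whenever $\rho_1 < I(A; Y\tilde A\tilde Y \mid C\tilde C\tilde U) - 3\delta(\epsilon_1)$.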
C.2 Upper bounds for $\Phi_2$, $\Phi_3$, $\Phi_4$

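Using the union bound, we have

$$\Phi_2 \le \sum_{i=1}^{2^{n\rho_0}} \sum_{j=1}^{2^{n\rho_1}} \Pr\Big[\big(\mathbf{U}[l-1], \mathbf{A}[l-1], \mathbf{Y}[l-1], \mathbf{C}[l-1], \mathbf{Y}[l], \mathbf{C}_{[l,i,\mathbf{C}[l-1]]}, \mathbf{A}_{[l,j,\mathbf{U}[l-1],\mathbf{C}[l-1],\mathbf{C}_{[l,i,\mathbf{C}[l-1]]}]}\big) \in \mathcal{A}_\epsilon^{(n)}(P_{\tilde C\tilde A\tilde U\tilde YCAY}),\ G_C[l] \neq i \,\Big|\, \mathcal{F}\Big]. \tag{96}$$

To keep the notation manageable, in the next few equations we will use the shorthand $\mathbf{C}_i$ for $\mathbf{C}_{[l,i,\mathbf{C}[l-1]]}$. We also redefine $\mathbf{T}'$ as the tuple $(\mathbf{U}[l-1], \mathbf{C}[l-1])$ and $\mathbf{T}$ as the tuple $(\mathbf{U}[l-1], \mathbf{C}[l-1], \mathbf{A}[l-1], \mathbf{Y}[l-1])$. (96) can then be written as

$$\begin{aligned}
\Phi_2 &\le \frac{1}{\Pr[\mathcal{F}]} \sum_{i=1}^{2^{n\rho_0}} \sum_{j=1}^{2^{n\rho_1}} \sum_{\mathbf{t},\mathbf{c},\mathbf{a},\mathbf{y} \in \mathcal{A}_\epsilon^{(n)}} \Pr\big[\mathbf{T} = \mathbf{t},\ \mathbf{C}_i = \mathbf{c},\ \mathbf{A}_{[l,j,\mathbf{T}',\mathbf{C}_i]} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y},\ G_C[l] \neq i\big] \\
&= \frac{2^{n(\rho_0+\rho_1)}}{\Pr[\mathcal{F}]} \sum_{\mathbf{t},\mathbf{c},\mathbf{a},\mathbf{y} \in \mathcal{A}_\epsilon^{(n)}} \Pr[\mathbf{T} = \mathbf{t}]\, \Pr\big[\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y},\ G_C[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t}\big],
\end{aligned} \tag{97}$$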

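where the equality is due to the symmetry of the codebook construction. We note that the index $G_C[l]$ is a function of the entire $C$-codebook $\{\mathbf{C}_i = \mathbf{C}_{[l,i,\mathbf{C}[l-1]]},\ i = 1, \ldots, 2^{n\rho_0}\}$, and so the events

$$G_C[l] \neq 1 \quad\text{and}\quad \big(\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{U}[l-1],\mathbf{C}[l-1],\mathbf{C}_1]} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y}\big)$$

are dependent. Define $\mathcal{C}$ as $\{\mathbf{C}_i,\ i = 2, \ldots, 2^{n\rho_0}\}$, i.e., the $C$-codebook without the first codeword.

We then have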

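$$\begin{aligned}
&\Pr\big[\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y},\ G_C[l] \neq 1 \,\big|\, \mathbf{T} = \mathbf{t}\big] \\
&\quad\le \Pr\big[\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y} \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad= \sum_{c} \Pr\big[\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]} = \mathbf{a},\ \mathbf{Y}[l] = \mathbf{y} \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] \cdot \Pr\big[\mathcal{C} = c \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(a)}{=} \sum_{c} \Pr\big[\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]} = \mathbf{a} \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] \cdot \Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big] \cdot \Pr\big[\mathcal{C} = c \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(b)}{\le} 2 \Pr\big[\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]} = \mathbf{a} \,\big|\, \mathbf{T}' = \mathbf{t}'\big] \sum_{c} \Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t},\ \mathcal{C} = c\big]\, \Pr\big[\mathcal{C} = c \,\big|\, G_C[l] \neq 1,\ \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(c)}{\le} 4 \Pr\big[\mathbf{C}_1 = \mathbf{c},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]} = \mathbf{a} \,\big|\, \mathbf{T}' = \mathbf{t}'\big]\, \Pr\big[\mathbf{Y}[l] = \mathbf{y} \,\big|\, \mathbf{T} = \mathbf{t}\big] \\
&\quad\overset{(d)}{\le} 4 \cdot 2^{-n(H(C|\tilde C) + H(A|C\tilde C\tilde U) - \delta(\epsilon_1))} \cdot 2^{-n(H(Y|\tilde C\tilde U\tilde A\tilde Y) - \delta(\epsilon_1))}.
\end{aligned} \tag{98}$$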

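In the chain above, (a) is true because given $G_C[l] \neq 1$, the following Markov chain holds:

$$\big(\mathbf{C}_{[l,1,\mathbf{C}[l-1]]},\ \mathbf{A}_{[l,1,\mathbf{T}',\mathbf{C}_1]}\big) - (\mathcal{C}, \mathbf{T}) - \mathbf{X}[l] - \mathbf{Y}[l].$$

(d) follows from Property 3 of typical sequences, while (b) and (c) follow from arguments very similar to Claim 1 in the previous subsection. Substituting the bound from (98) in (97), we obtain

$$\begin{aligned}
\Phi_2 &\le \frac{2^{n(\rho_0+\rho_1)}}{\Pr[\mathcal{F}]} \sum_{\mathbf{t}} \Pr[\mathbf{T} = \mathbf{t}] \sum_{\mathbf{c},\mathbf{a},\mathbf{y} \in \mathcal{A}_\epsilon^{(n)}(\cdot|\mathbf{t})} 4 \cdot 2^{2n\delta(\epsilon_1)} \cdot 2^{-nH(C|\tilde C)} \cdot 2^{-nH(A|C\tilde C\tilde U)} \cdot 2^{-nH(Y|\tilde C\tilde U\tilde A\tilde Y)} \\
&\overset{(a)}{\le} \frac{2^{n(\rho_0+\rho_1)}}{\Pr[\mathcal{F}]} \sum_{\mathbf{t}} \Pr[\mathbf{T} = \mathbf{t}] \Big( 2^{n(H(CAY|\tilde C\tilde U\tilde A\tilde Y) + \delta(\epsilon_1))} \cdot 4 \cdot 2^{2n\delta(\epsilon_1)} \cdot 2^{-nH(C|\tilde C)} \cdot 2^{-nH(A|C\tilde C\tilde U)} \cdot 2^{-nH(Y|\tilde C\tilde U\tilde A\tilde Y)} \Big) \\
&\le \frac{4 \cdot 2^{3n\delta(\epsilon_1)} \cdot 2^{n(\rho_0+\rho_1)} \cdot 2^{nH(CAY|\tilde C\tilde U\tilde A\tilde Y)}}{\Pr[\mathcal{F}] \cdot 2^{nH(C|\tilde C)} \cdot 2^{nH(A|C\tilde C\tilde U)} \cdot 2^{nH(Y|\tilde C\tilde U\tilde A\tilde Y)}},
\end{aligned} \tag{99}$$

where (a) follows from the upper bound on the size of the conditionally typical set.
In a similar fashion, we can obtain the following bounds for $\Phi_3$ and $\Phi_4$.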



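$$\Phi_3 \le \frac{4 \cdot 2^{3n\delta(\epsilon_1)} \cdot 2^{n(R'_1+\rho_1)} \cdot 2^{nH(\tilde UAY|C\tilde C\tilde A\tilde Y)}}{\Pr[\mathcal{F}] \cdot 2^{nH(\tilde U|\tilde C\tilde A)} \cdot 2^{nH(A|C\tilde C\tilde U)} \cdot 2^{nH(Y|C\tilde C\tilde A\tilde Y)}}, \tag{100}$$

$$\Phi_4 \le \frac{4 \cdot 2^{3n\delta(\epsilon_1)} \cdot 2^{n(R'_1+\rho_0+\rho_1)} \cdot 2^{nH(\tilde UCAY|\tilde C\tilde A\tilde Y)}}{\Pr[\mathcal{F}] \cdot 2^{nH(C|\tilde C)} \cdot 2^{nH(\tilde U|\tilde C\tilde A)} \cdot 2^{nH(A|C\tilde C\tilde U)} \cdot 2^{nH(Y|\tilde C\tilde A\tilde Y)}}. \tag{101}$$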



Lemmas 5.1 and 5.2, together with the induction hypothesis that $\Pr[\mathcal{E}_4[k]] < \epsilon$ for $k = 1, \ldots, l-1$, imply that $\Pr[\mathcal{F}] > 1 - 5\epsilon l$, which is close to 1 for $\epsilon \ll 1/L$. Thus the bounds (95) and (99)-(101) can be made arbitrarily small for sufficiently large $n$ if the conditions of the lemma are satisfied.

Substituting back in (84), we obtain $\Pr[\mathcal{E}_4[l] \mid \mathcal{F}] < \epsilon$ for all sufficiently large $n$. Similarly, one can show that $\Pr[\mathcal{E}_5[l] \mid \mathcal{F}] < \epsilon$ if the conditions in the statement of the lemma are satisfied.